Abstract of Dissertation Presented to the Graduate School
of  the  University  of  Florida  in  Partial  Fulfillment  of  the 
Requirements  for  the  Degree  of  Doctor  of  Philosophy 
Chairman: Dr. Gerhard X. Ritter

Cochairman:  Dr.  Joseph  N.  Wilson 

Major  Department:  Computer  and  Information  Sciences 

Program  optimization  has  long  been  an  important  function  of  compilers.  Tra- 
ditionally, global  optimizations  have  been  accomplished  by  collecting  large  sets  of 
data  flow  information  items  about  the  various  statements  in  the  program.  This  dis- 
sertation provides  a new  approach  to  global  optimizations  by  introducing  a set  of 
provably  correct  code  transformations  which  can  be  combined  to  perform  most  of  the 
traditional  global  data  flow  code  optimizations. 

First,  a small  language  which  could  be  viewed  as  an  intermediate  code  of  an 
image processing language is defined syntactically and semantically. Next, a group
of primitive source-to-source transformations on this language is described and the
conditions under which each transformation is valid are proved. Then, these primitive
transformations  are  combined  to  yield  global  transformations  such  as  code  motion 
and  copy  propagation.  A new  result  called  loop-conditional  joining  is  also  developed 
from  the  primitive  transformations. 

Finally,  a prototype  system  using  these  techniques  is  developed.  It  enables  a user 
to  experiment  with  a variety  of  code  transformations  and  provides  some  assistance 
with  heuristics  designed  to  improve  code. 




CHAPTER  1 
INTRODUCTION 


While  straightforward  implementations  of  the  image  algebra  as  a means  of  speci- 
fying image  processing  algorithms  have  been  successful  in  providing  people  a uniform 
means  of  discussing  these  algorithms,  these  implementations  have  produced  some 
programs  which  are  highly  inefficient  in  using  machine  resources.  This  study  was 
motivated  by  the  desire  to  find  ways  to  optimize  image  algebra  code.  Unfortunately, 
most  of  the  traditional  optimization  techniques  were  ill-suited  to  the  gross  inefficien- 
cies introduced  by  direct  translation  of  image  algebra  programs.  Additionally,  the 
proofs  of  correctness  of  global  data  flow  optimizations  are  based  on  the  flow  of  the 
program  data,  rather  than  the  meaning  of  the  program.  This  work  presents  a new 
approach  to  code  optimization  designed  to  solve  both  of  these  problems. 

Traditionally, code optimization has been classified as either peephole, where small
pieces of code could have relatively small changes applied, or global, where transformations
could be made on a larger scale. Determining when global optimizations can
be  performed  has  previously  been  done  by  looking  at  the  results  of  global  data  flow 
analysis.  Rather  than  collect  the  large  sets  of  data  required  for  global  data  flow  anal- 
ysis, I collect  the  variables  set  and  used  by  the  execution  of  each  statement.  Using 
only  this  information,  primitive  transformations  can  be  performed. 

These primitive transformations can be proven to be correct using denotational
semantics. Previously, any proof of transformation correctness has been carried out
by examining the program flow graph without regard to semantics. These provably
correct transformations can then be combined to give many of the same global
transformations as those provided by global data flow analysis. Since the basic
transformations of the system are so small, they can easily be rearranged and recombined
for the task at hand. Thus, I have developed some new and previously unexploited
transformations which are highly beneficial for image processing programs.

I have  developed  a prototype  optimizing  system  which  implements  all  of  the  prim- 
itive transformations  and  a number  of  the  global  optimizations  which  can  be  built 
from  them.  This  system  allows  a user  to  experiment  with  a variety  of  combinations 
of  the  techniques  and  demonstrates  the  power  and  flexibility  of  this  approach. 

The  remainder  of  this  dissertation  is  divided  into  six  chapters.  Chapter  2 provides 
a brief  background  in  traditional  code  optimization  techniques,  semantic  approaches 
to  code  optimization,  and  image  processing. 

Chapter  3 provides  a small  language  which  could  be  viewed  as  a simple  inter- 
mediate language.  This  chapter  provides  the  syntactic  and  denotational  semantic 
definition  for  the  language,  along  with  a discussion  of  the  variables  set  and  used  by 
statements  and  some  preliminary  results  about  the  language. 

The  primitive  transformations  are  presented  in  Chapter  4.  A full  proof  of  the 
correctness  of  each  transformation  is  given  along  with  its  description.  Some  of  the 
more  beneficial  global  transformations  derivable  from  these  primitive  transformations 
are  presented  in  Chapter  5. 

Chapter  6 describes  the  prototype  optimizer  developed  from  these  transforma- 
tions. Along  with  describing  the  system  and  its  basic  operation,  it  discusses  some  of 
the  possible  combinations  of  the  transformations  and  examines  how  this  can  improve 
the  execution  of  a sample  program. 

Finally,  Chapter  7 presents  conclusions  and  suggestions  for  further  work. 


CHAPTER  2 

BACKGROUND  MATERIAL 


This  dissertation  combines  work  from  several  diverse  areas  of  computer  science. 
First, much work has previously been done in code optimization. Originally, most
optimizations  were  done  at  the  local  or  peephole  level,  to  correct  problems  of  one 
particular  compiler  or  to  fine  tune  for  one  particular  architecture.  In  the  early  1970s, 
there  was  great  interest  in  more  global  optimizations.  Both  of  these  types  of  opti- 
mizations are  discussed  in  Section  2.1. 

Most  of  the  work  done  in  optimization  is  based  on  a graphical  view  of  the  program 
being  optimized  and  computes  a variety  of  information  based  on  the  structure  of  the 
program  graph.  Instead,  I approach  the  problem  from  a semantic  view  and  look 
at  the  meanings  of  statements,  rather  than  their  positions  in  the  overall  program. 
Background  material  on  semantics  is  presented  in  Section  2.2.  Much  of  Chapter  3 is 
devoted  to  presenting  a denotational  semantic  background  for  the  language  discussed 
in  this  dissertation. 

Although  the  language  described  here  and  the  transformations  are  general  pur- 
pose, this  work  has  been  motivated  by  the  desire  to  improve  the  running  time  of  image 
processing  programs  as  implemented  in  the  image  algebra.  A brief  introduction  to 
image  processing  and  the  image  algebra  is  given  in  Section  2.3. 

Finally,  there  is  no  way  to  guarantee  that  a single  combination  of  transformations 
produces  an  optimal,  or  even  an  improved,  program.  Instead,  heuristics  must  be 
developed  to  actually  use  these  transformations  to  arrive  at  a new  program  with  a 
shorter execution time. This involves the use of expert system techniques as discussed
in  Section  2.4. 

2.1  Code  Optimization 

Straightforward  translation  from  a high-level  language  to  machine  code  almost 
never  produces  code  as  good  as  that  which  a human  machine  language  programmer 
could  write  for  the  same  task.  All  of  the  optimizations  discussed  in  this  section  are 
definite  code  improvements.  Some  of  them  also  have  relaxed  rules  for  application  if 
the  compiler  writer  is  willing  to  take  the  chance  that  the  code  may  be  degraded  in 
certain instances with the relaxed rules. A code transformation is considered by Aho
et  al.  [1]  to  be  an  optimization  if  it  meets  the  following  conditions: 

a.  The  transformation  preserves  the  meaning  of  the  program. 

b.  The  transformation  speeds  up  the  execution  of  the  program.  Some  transfor- 
mations may  be  undertaken  to  reduce  the  size  of  the  code  produced,  but  the  primary 
emphasis  is  on  speed. 

c.  The  execution  time  saved  by  the  transformation  is  at  least  as  much  as  the  time 
it  takes  to  perform  the  transformation. 

It  is  this  third  condition  which  has  kept  many  of  the  transformations  in  Chapter  4 
from  being  considered  as  optimizations  before  this  time. 

2.1.1  Peephole  Optimizations 

There  are  some  optimizations  of  statements  that  can  be  made  with  little  knowl- 
edge of  the  code  surrounding  the  statements.  These  are  known  as  peephole  optimiza- 
tions. Only  a few  instructions  (those  in  the  peephole)  need  to  be  examined  at  a time 
to  apply  these  measures.  Peephole  optimizations  are  described  below. 

Removal  of  redundant  stores  and  loads.  The  final  step  of  one  instruction  may  be 
a store  of  a value  and  the  first  step  of  the  next  instruction  a load  of  the  same  value. 
If this is the case (and both instructions are in the same block), the load instruction
would  not  be  necessary.  (A  load  followed  by  a store  would  result  in  the  removal  of 
the  store  instruction.) 

Removal  of  unreachable  code.  Although  not  all  unreachable  code  can  be  deter- 
mined by  peephole  optimization,  some  can  be.  All  code  following  an  unconditional 
branch  and  before  the  next  labelled  statement  is  unreachable  and  can  be  removed. 

Flow-of-control optimizations. If the peephole used does not require the statements
to be contiguous, jump sequences can be examined and optimized. A jump
statement  to  another  jump  statement  can  be  replaced  by  a jump  to  the  final  desti- 
nation. This  may  result  in  the  removal  of  the  intermediate  jump  statement. 

Algebraic  simplification.  There  are  many  algebraic  identities  that  may  be  ex- 
ploited, but  the  most  common  ones  (and  therefore  the  most  beneficial  to  optimize 
for)  involve  statements  of  the  form  x :=  x + 0,  x :=  x * 1 and  x :=  x * 0.  These  can 
be  replaced  with  simple  assignments  or  removed  entirely. 

Reduction  in  strength.  Operations  which  are  considered  expensive,  such  as  mul- 
tiplication and  computing  squares,  are  replaced  by  equivalent  operations  using  less 
expensive  operators  (computing  a square  may  be  replaced  by  a multiplication  and 
multiplication  may  be  replaced  by  a shift  operation). 

Use  of  machine  idioms.  Different  target  machines  may  have  different  operations 
implemented.  Using  these  special  operations  may  improve  the  code. 

Details on all of these can be found in Aho et al. [1]. Actual use of these techniques
for OS/360 FORTRAN H and SIMPL is reported by Lowry and Medlock [17] and
Zelkowitz and Bail [30], respectively. Although these steps seem simple compared to
the optimizations that result from global data flow analysis, they remain worthwhile:
many redundant statements can be produced by the compiler front end, which performs
code generation for each statement individually without considering what code the
surrounding statements have generated.
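
As an illustration only, the algebraic simplification and strength reduction rules described above can be sketched over a toy instruction format. The tuple layout `(dest, op, left, right)`, the operation names `copy`, `const` and `shl`, and the function names are assumptions made for this sketch; they are not taken from Aho et al. or from the system described in this dissertation.

```python
def peephole(instr):
    """Simplify one instruction of the hypothetical form (dest, op, left, right).

    Returns a simpler instruction, or None when the instruction can be
    removed entirely. Operands are variable names (str) or integer constants.
    """
    dest, op, left, right = instr
    is_identity = (op == "+" and right == 0) or (op == "*" and right == 1)
    # x := x + 0 and x := x * 1 change nothing, so they can be removed.
    if is_identity and dest == left:
        return None
    # y := x + 0 and y := x * 1 become a simple copy.
    if is_identity:
        return (dest, "copy", left, None)
    # x := y * 0 becomes a simple assignment of the constant 0.
    if op == "*" and right == 0:
        return (dest, "const", 0, None)
    # Reduction in strength: multiplication by 2 becomes a left shift by 1.
    if op == "*" and right == 2:
        return (dest, "shl", left, 1)
    return instr


def optimize(code):
    """Apply the peephole rules to each instruction, dropping removed ones."""
    result = []
    for instr in code:
        simplified = peephole(instr)
        if simplified is not None:
            result.append(simplified)
    return result
```

Each rule inspects a single instruction, which is the smallest possible peephole; the redundant load and jump-to-jump rules would need a window of two or more instructions.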

2.1.2  Global  Data  Flow  Analysis 

Global  data  flow  analysis  examines  the  definitions  and  uses  of  variables  in  a pro- 
gram. While  there  are  other  uses  for  data  flow  analysis,  some  of  which  are  discussed 
by  Muchnick  and  Jones  [19],  the  most  important  is  in  performing  global  program 
optimizations,  discussed  in  Section  2.1.3.  Data  flow  analysis  is  most  commonly  done 
using  elimination  methods.  Allen  presents  most  of  the  basic  concepts  of  data  flow 
analysis  [2].  Other  methods  have  been  developed  for  determining  the  same  informa- 
tion, but  using  different  algorithms.  A good  overview  of  elimination  methods  of  data 
flow analysis is provided by Ryder and Paull [26].

Data  flow  analysis  is  based  on  simple  graph  theory.  It  begins  with  a control  flow 
graph  of  the  program.  From  that  graph,  using  one  of  the  techniques  discussed  in  the 
previous  paragraph,  one  first  identifies  loops  suitable  for  improvement  in  the  code 
and  then  computes  information  for  each  statement.  This  information  consists  of  the 
items  discussed  below. 

Reaching  definitions.  For  each  statement,  all  of  the  possible  definitions  of  every 
variable  will  be  calculated.  All  of  the  possible  definitions  for  a use  of  a particular 
variable  in  a statement  are  collected  into  a list  known  as  the  use-definition  chain,  or 
ud-chain. 

Live  variables.  A variable  is  considered  to  be  alive  at  a statement  if  it  could  be 
used  somewhere  in  or  after  the  statement. 

Definition-use  chain.  For  each  definition  of  a variable,  all  of  the  possible  uses  of 
the  definition  will  be  listed.  This  is  also  known  as  the  du-chain. 
Available expressions. All of the expressions which have already definitely been
computed,  with  no  possible  change,  are  calculated  for  each  statement.  This  will  allow 
redundant  subexpression  elimination  to  occur. 

Copy  statements.  For  each  statement,  all  statements  of  the  form  a :=  b which 
have  preceded  it  and  have  not  had  either  a or  b redefined  are  collected.  These  will 
be  used  in  copy  propagation. 
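
To make one of these items concrete, the following is a minimal sketch of computing live variables for straight-line code only. The statement representation, a pair `(defined, used)`, and the function name are assumptions introduced here; the elimination methods cited above handle full control flow graphs, which this sketch does not attempt.

```python
def live_before_each(statements, live_at_exit):
    """Compute the set of variables live just before each statement.

    Each statement is a hypothetical pair (defined, used): the variable it
    assigns and the set of variables its right-hand side reads. Only
    straight-line code is handled; loops would need iteration to a fixed point.
    """
    live = set(live_at_exit)
    result = [None] * len(statements)
    # Walk backward: a variable is live before a statement if the statement
    # uses it, or if it is live afterward and not redefined by the statement.
    for i in range(len(statements) - 1, -1, -1):
        defined, used = statements[i]
        live = (live - {defined}) | set(used)
        result[i] = set(live)
    return result
```

For the sequence `a := b + c; d := a; a := 5` with `a` and `d` live at exit, the sketch reports `{b, c}`, `{a}` and `{d}` as the live sets before the three statements.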

2.1.3  Global  Data  Flow  Optimizations 

Peephole  optimization  works  only  with  a few  statements  at  a time.  When  the 
additional  information  provided  by  data  flow  analysis  is  known  about  a program, 
additional  optimizations  are  possible.  The  two  most  important  collections  of  these 
are  the  Allen-Cocke  catalogue  [3]  and  the  Irvine  catalogue  [29].  These  catalogues  view 
the  optimizations  at  a very  high  level,  giving  more  of  an  idea  of  what  can  be  done 
rather  than  how  it  is  done.  An  overview  of  these  catalogues,  giving  the  traditional 
global  data  flow  optimizations,  is  presented  by  Kennedy  [16].  These  optimizations 
are  below. 

Redundant  subexpression  elimination.  A subexpression,  once  it  is  calculated, 
may  not  need  to  be  recomputed  when  it  is  used  again.  If  there  is  no  possible  change 
to  the  variables  in  the  subexpression  between  where  it  is  originally  computed  and 
where  it  is  recomputed,  a new  variable  is  created  and  the  value  of  the  subexpression 
is  assigned  to  the  new  variable.  Instead  of  recomputing  the  subexpression,  the  value 
of the new variable is used. This was first discussed by Cocke [7].

Copy  propagation.  A copy  statement,  of  the  form  A :=  B,  can  be  removed  and 
all  uses  of  A replaced  with  B if  there  are  no  definitions  of  A or  B between  the  copy 
statement  and  the  uses  of  A.  This  will  not  only  eliminate  extra  copy  statements  and 
variables  the  programmer  (or  high-level  language)  may  have  produced,  but  will  also 
eliminate many of the extra copy statements produced by other optimizations. If B
is  a constant,  this  procedure  is  called  constant  folding.  Constant  values  may  be  sub- 
stituted into  expressions  wherever  possible  and  the  resulting  peephole  optimization 
may  reduce  entire  expressions  to  constants. 

Code  motion.  Statements  in  a loop  that  do  not  depend  on  the  variables  that 
may  change  in  the  loop  can  be  moved  outside  of  the  loop.  This  eliminates  multiple 
executions  of  a statement  that  only  needs  to  be  executed  once. 

Strength  reduction  of  induction  variables.  Induction  variables  are  variables  that 
depend  on  the  loop  variable  for  their  values.  They  are  typically  given  values  which 
are  some  linear  function  of  the  loop  control  variable.  Rather  than  recompute  this 
function  every  time  the  loop  control  variable  changes,  it  can  be  computed  once  at  the 
beginning  of  the  loop  and  incremented  each  successive  time  through  the  loop.  This 
will  replace  the  computation  of  an  expression  with  a simpler  statement. 

Elimination  of  induction  variables.  Induction  variables  may  in  some  cases  be  re- 
placed by  the  loop  control  variable  which  they  depend  on.  This  will  remove  a variable 
and  possibly  allow  further  optimizations. 

Dead  code  elimination.  If  the  du-chain  of  a statement  contains  no  entries,  the 
definition  in  the  statement  is  never  used,  and  therefore  the  statement  can  be  elimi- 
nated. 

Procedure  integration.  The  body  of  a procedure  can  sometimes  be  substituted 
for  the  procedure  call.  This  has  the  advantage  of  reducing  procedure  call  overhead, 
which  is  very  inefficient  in  some  compilers.  It  may  also  allow  other  optimizations  to 
occur  and  give  more  restricted  ud-chains  and  du-chains. 

Machine-dependent  optimizations.  If  something  is  known  about  the  target  ma- 
chine’s organization,  other  optimizations  to  take  advantage  of  the  machine’s  features 
can  be  made.  The  most  common  machine-dependent  optimizations  are  listed  below. 
Register allocation. Different machines have different numbers of registers and
different types of special-purpose registers. They also have different register
manipulation instructions, such as auto-increment. If the optimizer knows about the
specifics  of  the  registers,  it  can  better  allocate  registers  to  avoid  redundant  store  and 
load  operations  and  to  use  these  specialized  instructions.  There  is  also  some  opti- 
mization which  can  be  done  without  knowing  all  of  the  details  of  a specific  machine. 
This  global  machine-independent  register  allocation  utilizes  usage  counts  to  deter- 
mine which  values  should  reside  in  a limited  number  of  registers  and  is  discussed  by 
Chow  [6]. 

Detection  of  parallelism.  Any  instruction  which  can  be  coded  as  a vector  opera- 
tion should  be  identified  if  the  target  machine  is  a vector  machine.  Methods  to  do 
this are discussed by Schneck [27].

A good  overview  of  the  rules  for  performing  subexpression  elimination,  copy  prop- 
agation, code  motion,  and  strength  reduction  can  be  found  in  many  introductory 
compiler  texts  [1,5].  Specific  work  in  the  implementation  of  copy  propagation,  dead 
code  elimination,  code  motion,  strength  reduction  and  elimination  of  induction  vari- 
ables and  register  allocation  has  been  done  by  Chow  using  U-Code  (described  in 
Section 2.3) as the intermediate language [6].
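
Two of the optimizations in this section, copy propagation and dead code elimination, interact in a way that a short sketch may clarify. The code below handles straight-line code only and uses a hypothetical statement format `(dest, op, args)` in which a copy `A := B` is written `(A, "copy", (B,))`. It is not the algorithm of any of the cited catalogues, only an illustration of the rules as stated above.

```python
def propagate_copies(code):
    """Replace uses of A with B after a copy A := B, within straight-line code.

    A recorded copy is forgotten as soon as either of its variables is
    redefined, matching the condition that neither A nor B may be defined
    between the copy statement and the use.
    """
    copies = {}  # maps A to B for every copy A := B that is still valid
    result = []
    for dest, op, args in code:
        # Rewrite operands using the copies that are valid at this point.
        args = tuple(copies.get(a, a) for a in args)
        # A definition of dest invalidates every copy that mentions dest.
        copies = {a: b for a, b in copies.items() if dest not in (a, b)}
        if op == "copy" and isinstance(args[0], str) and args[0] != dest:
            copies[dest] = args[0]
        result.append((dest, op, args))
    return result


def eliminate_dead_code(code, live_at_exit):
    """Drop statements whose definitions are never used afterward."""
    live = set(live_at_exit)
    kept = []
    for dest, op, args in reversed(code):
        if dest in live:
            kept.append((dest, op, args))
            live.discard(dest)
            live.update(a for a in args if isinstance(a, str))
    kept.reverse()
    return kept
```

Running copy propagation first leaves the copy statement itself with no remaining uses, so dead code elimination can then remove it.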

2.2  Program  Semantics 

While  there  has  been  previous  work  on  the  mathematical  background  of  the 
correctness  of  program  optimization  (including  an  entire  book  devoted  to  the  sub- 
ject [28]),  this  has  not  included  a formal  notion  of  the  semantics  of  the  program  being 
optimized.  Rosen  introduces  a high-level  approach  to  the  problem,  but  does  not  use 
any  sort  of  semantic  definition  of  the  language  with  which  he  is  working  [25].  All  of 
these works have been based on an informal notion of what programs mean.

Cousot presents some of the earliest formal work in this area. He uses an
operational semantic framework in which to perform program analyses. I assert that
an operationally based framework does not yield the coherent hierarchical framework
that denotational semantics provides. This view is shared by a number of
others [12, 9]. In addition, the current popularity of denotational definitions
certainly makes optimization work based upon them more appealing. Donzeau-Gouge
has explored the application of denotational semantics to program optimization [10].
This work demonstrates the applicability of this technique to such optimizations as
constant propagation, common subexpression determination, and invariant determination,
but does not discuss the elimination of these common subexpressions and invariant
expressions. In addition, optimizations such as code motion, loop rolling and unrolling,
etc., are not discussed.

2.3 Image Processing and the Image Algebra

Image processing has two main goals. First, images may be processed to enhance
them for human use. Image processing is crucial in the images of planets sent back
by space probes, for example. Second, images may be processed for machine
interpretation. Current work in computer vision includes medical diagnosis, military
target acquisition, robotics navigation, and face recognition for television rating
services. Introductions to image processing in general can be found in a number of
texts [4,13].

Digital images consist of some underlying system of discrete points, called pixels
(short for picture elements), each of which has a corresponding value in the image.
This value may correspond to some brightness indicator or infrared reading for
the point, or may even be a vector of values. A common image, the black and white
photograph from newspapers, has a 2-dimensional grid of pixels and gray levels indicating
the relative brightness or darkness as the values. Some common image processing
techniques include detecting edges, smoothing and sharpening, and locating features
in an image.

The  AFATL  Image  Algebra  was  developed  to  provide  a standard  mathematical 
environment  for  image  processing.  It  can  perform  any  gray  level  image-to-image 
transformation  and  has  the  advantage  of  having  a formal  mathematical  basis. 

The most basic operand in the image algebra is the image. Images can have many
different coordinate sets and values. A coordinate set (usually denoted $X$) must be a
compact subset of $\mathbb{R}^n$, with $n$ most often being 2, to indicate 2-dimensional images.
The image value set (usually denoted $F$) must be a groupoid. The most common
image value sets are integers, natural numbers, real numbers and vectors of integers,
natural, or real numbers. An $F$-valued image, $a$, on a set of image coordinates, $X$, is
defined to be the graph of the function $a : X \to F$, or:

$$a = \{(x, a(x)) : x \in X,\ a(x) \in F\}$$

The image algebra provides numerous types of image functions. Binary operations
between images include $+$, $-$, $*$, $\wedge$, and $\vee$. These functions will operate pointwise on
two images with the same coordinate system. There are also elementary functions,
such as the characteristic function ($\chi$) and the sum ($\Sigma$) of an image. The
characteristic function of an image will be a binary image (that is, an image consisting of
just 0 and 1) which is 1 where the pixel meets certain requirements and 0 everywhere
else. Thus the characteristic function $\chi_{\le 7}(a)$ will be 1 where the original image $a$
has a gray level less than or equal to 7 and will be 0 everywhere else. The sum of an
image is defined to be $\sum_{x \in X} a(x)$. There are also a variety of template operations,
both with images and with other templates. These typically require subroutines to
implement and are outside the scope of this research. A fuller introduction to the
image algebra is presented by Ritter and Wilson [23].
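
The pointwise operations, the characteristic function and the image sum described above can be illustrated with a small sketch. Here an image is modelled as a Python dictionary from coordinate tuples to gray levels; this representation and the names `pointwise`, `chi_le` and `image_sum` are assumptions for illustration and are not part of Image Algebra FORTRAN or any other implementation mentioned in this chapter.

```python
def pointwise(op, a, b):
    """Apply a binary operation pointwise to two images on the same coordinate set."""
    if a.keys() != b.keys():
        raise ValueError("images must share the same coordinate set")
    return {x: op(a[x], b[x]) for x in a}


def chi_le(a, k):
    """Characteristic function: 1 where the gray level is <= k, 0 elsewhere."""
    return {x: 1 if value <= k else 0 for x, value in a.items()}


def image_sum(a):
    """Sum of the image values over all coordinates."""
    return sum(a.values())
```

With this representation, `chi_le(a, 7)` plays the role of $\chi_{\le 7}(a)$, and `image_sum(chi_le(a, 7))` counts the pixels whose gray level is at most 7.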

Currently  the  image  algebra  is  implemented  as  an  extension  to  FORTRAN,  and 
FORTRAN  programs  can  be  written  which  are  converted  to  standard  FORTRAN 
programs  by  the  Image  Algebra  FORTRAN  preprocessor.  A description  of  Image 
Algebra  FORTRAN  is  provided  by  Ritter  et  al.  [24].  Work  is  underway  to  implement 
the  image  algebra  with  Image  Algebra-C.  This  work  was  begun  by  Perry  [22].  Because 
so  much  of  the  image  algebra  has  the  potential  for  vectorization  and  because  of  the 
interest  in  parallel  architectures  for  image  processing  in  general  [11],  the  intermediate 
language  U-Code  [20]  was  enhanced  to  include  vector  instructions.  (The  addition  of 
vector  instructions  to  a language  is  discussed  by  Zosel  [31].)  The  resulting  V-Code 
serves  as  the  intermediate  language  of  Image  Algebra-C. 

While  the  image  algebra  provides  powerful  notation,  these  previous  attempts  at 
straightforward  translation  from  image  algebra  programs  to  lower-level  languages 
have  led  to  highly  inefficient  code.  (This  is  not  just  a shortcoming  of  the  image  al- 
gebra or  these  implementations.  Inefficient  translation  of  code  has  been  a problem 
for  almost  as  long  as  there  has  been  translation  of  code.)  Hence,  optimizations  are 
important  to  implement  for  the  image  algebra.  Inspection  of  existing  optimization 
techniques  showed  they  were  lacking  for  some  of  the  high-level  inefficiencies  intro- 
duced by  the  image  algebra.  Several  new  or  previously  unexploited  techniques,  such 
as  backward  copy  propagation  and  loop-conditional  joining,  are  needed  to  better 
improve  the  code. 

2.4 Expert Systems

Although  this  dissertation  is  not  intended  to  contribute  to  the  field  of  artificial 
intelligence,  the  techniques  employed  in  building  simple  expert  systems  proved  quite 
useful  in  the  work  presented  in  Section  6.4.  Expert  systems  are  programs  which 
behave  as  a human  expert  would.  They  are  particularly  useful  in  situations  where  no 
algorithmic  solution  is  possible.  An  overview  of  expert  systems  is  presented  by  Hayes- 
Roth  et  al.  [14].  For  this  particular  expert  system,  the  simpler  reasoning  techniques 
of  MACSYMA,  as  discussed  by  Martin  and  Fateman  [18],  were  sufficient. 


CHAPTER  3 

LANGUAGE  DEFINITION 

This  chapter  introduces  the  terminology  used  in  this  dissertation.  A small  lan- 
guage is  defined  and  its  denotational  semantics  are  presented  in  the  style  of  de 
Bakker  [9].  The  language  provides  both  simple  and  indexed  integer  variables,  in- 
teger and  boolean  expressions,  the  basic  structured  statements,  an  empty  statement, 
and  an  assignment  statement.  This  is  intentionally  a simple  language.  However,  it 
suffices  for  the  ideas  presented  here.  If  it  were  more  complex,  the  proofs  in  Chapters  4 
and  5 would  be  much  more  involved,  with  few,  if  any,  benefits.  This  simplification 
of  a language  to  make  optimization  easier  is  not  without  precedent.  Rosen  discusses 
movement  of  optimization  decisions  from  compile  time  to  design  time  [25]. 

In  keeping  with  this  desire  for  simplicity,  the  language  is  restricted  to  programs 
containing  loops  with  bounds  fixed  at  the  time  of  loop  entry  and  does  not  support 
subprograms.  These  language  constructs,  though  amenable  to  the  kind  of  treatment 
given  other  constructs  presented  here,  introduce  a level  of  complexity  which  would 
greatly  complicate  the  proofs  presented  with  little  or  no  benefit  to  the  image  pro- 
cessing programs  being  considered. 

The  first  section  describes  the  syntax  of  the  language.  The  second  section  dis- 
cusses variable  locations  and  the  states  that  assign  them  meanings.  Section  3.3  defines 
a basic  complexity  measure  for  expressions  and  statements  in  this  language,  which 
will  be  used  by  some  proofs  in  later  sections.  The  syntax  of  substitution  is  given 
in  Section  3.4.  Sections  3.5,  3.6  and  3.7  give  the  semantics  of  the  language,  state 
variants, and substitution. The way statements affect and are affected by the values
stored at locations is discussed in Section 3.8 and a static approximation of this is
provided in Section 3.9.

Table 3.1. Syntactic Notation

Name   Description                                               Typical elements
Icon   Integer constants                                         m, n
Svar   Simple variables                                          x, y, z, u
Avar   Array variables                                           a
Ivar   Any integer variable (Svar ∪ Avar)                        v, w
Iexp   Integer expressions                                       s
Bexp   Boolean expressions                                       b
Stat   Statements                                                S
       Textual substitution into a member of Iexp (Bexp, Stat)   s[s_1/y]
       Left-hand-side substitution into a member of Iexp         v<v_1/y>


3.1  Language  Syntax 

A brief description of the notation used in this language is given in Table 3.1.
Variables in this language are members of the set Ivar and may be either simple
integer-valued variables (members of the set Svar: $x$, $y$, $x_1$, etc.) or
integer-expression-indexed integer arrays (members of the set Avar: $a$, $a_1$, etc.).

Definition 3.1.1 (Integer variables)
$v ::= x \mid a[s]$

Integer expressions may consist of variables, constants, binary operations and
conditional expressions. The set Icon will contain all integer constants ($m$, $n$, $m_1$,
etc.) while Iexp contains the integer expressions ($s$, $s_1$, etc.). Actual integer values
($\alpha$, $\alpha_1$, etc.) are members of the set $V$.

Definition 3.1.2 (Integer expressions)
$s ::= v \mid m \mid s_1 \oplus s_2 \mid \textbf{if}\ b\ \textbf{then}\ s_1\ \textbf{else}\ s_2\ \textbf{fi}$

(where $\oplus$ is any binary operator).
Boolean expressions may consist of the boolean constants true and false,
relational operations, negation, implication, and subrange inclusion. Boolean expressions
($b$, $b_1$, etc.) are in the set Bexp and map to truth values ($\beta$, $\beta_1$, etc.) in the set of
truth values $W$.

Definition 3.1.3 (Boolean expressions)
$b ::= \textbf{true} \mid \textbf{false} \mid s_1 \mathbin{@} s_2 \mid \neg b \mid s\ \textbf{in}\ (s_1 \ldots s_2)$

(where @ is any relational operator).

Statements (members of the set Stat: $S$, $S_1$, etc.) consist of assignment and empty
statements, along with the standard structured constructs of concatenation, selection
and iteration. A program in this language will be the same as a statement.

Definition 3.1.4 (Statements)

$S ::= v := s \mid S_1; S_2 \mid \textbf{if}\ b\ \textbf{then}\ S_1\ \textbf{else}\ S_2\ \textbf{fi} \mid\ \mid \textbf{for}\ x := s_1\ \textbf{to}\ s_2\ \textbf{do}\ S\ \textbf{od}$

(where $x$ does not appear anywhere else).

This  language  provides  no  boolean  binary  operators.  If  and  is  needed,  the  state- 
ment if  b\  and  b2  then  Si  else  S2  fi  can  be  replaced  with  if  b\  then  if  b2  then 
Si  else  S2  f i else  S2  fi.  Similar  replacement  is  possible  for  the  or  operator.  This 
provides  conditional  evaluation  of  the  and  statement,  where  if  the  first  clause,  61, 
is  not  true,  the  second  clause  will  not  even  be  evaluated.  A number  of  high-level 
languages,  such  as  C and  Modula-2,  have  similar  conditional  evaluation. 
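
The replacement described above can be sketched as a small rewriting function. This is only an illustration: the tuple encoding of statements and the name `desugar` are assumptions made for the example, and the rule used for or is one plausible reading of the "similar replacement" the text mentions, not a rule stated in the source.

```python
# Hypothetical tuple encoding, not taken from the source:
#   ("if", cond, then_stmt, else_stmt) is a selection statement;
#   a condition may be ("and", b1, b2) or ("or", b1, b2); anything else is opaque.

def desugar(stmt):
    """Rewrite `and`/`or` conditions into nested if statements."""
    if not (isinstance(stmt, tuple) and stmt[0] == "if"):
        return stmt  # other statements are left untouched in this sketch
    _, cond, s1, s2 = stmt
    s1, s2 = desugar(s1), desugar(s2)
    if isinstance(cond, tuple) and cond[0] == "and":
        _, b1, b2 = cond
        # if b1 and b2 then S1 else S2 fi
        #   ==> if b1 then if b2 then S1 else S2 fi else S2 fi
        return desugar(("if", b1, ("if", b2, s1, s2), s2))
    if isinstance(cond, tuple) and cond[0] == "or":
        _, b1, b2 = cond
        # assumed analogue for `or`:
        #   ==> if b1 then S1 else if b2 then S1 else S2 fi fi
        return desugar(("if", b1, s1, ("if", b2, s1, s2)))
    return ("if", cond, s1, s2)


nested_and = desugar(("if", ("and", "b1", "b2"), "S1", "S2"))
nested_or = desugar(("if", ("or", "b1", "b2"), "S1", "S2"))
```

In both rewritten forms the second condition is only reached when the first one does not already decide the outcome, which is the conditional evaluation the paragraph describes.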

3.2  Variable  Locations  and  States 

The set LocV contains all possible variable locations. A state is a function
$\sigma : LocV \to V$, mapping locations of variables into integer values. The set of all states is
$\Sigma$. The location of a variable $v$ in a state $\sigma$, $\mathcal{L}(v)(\sigma)$, is an intermediate variable ($\xi$, $\xi_1$,
etc.). For simple integer variables, the location of the variable is simply the variable
itself. With array variables, the location of the variable is given by the array and an
integer, the meaning of the index for a particular state. (Definition 3.5.1 explains the
meaning of expressions in a given state, $\mathcal{R}$.)

Definition 3.2.1 (Location of a variable)

$\mathcal{L}(v)(\sigma) = \begin{cases} x & \text{if } v = x \in Svar \\ \langle a, \mathcal{R}(s)(\sigma) \rangle & \text{if } v = a[s] \in Avar \end{cases}$

For any program, the domain of $\sigma$, a subset of LocV containing all of the
intermediate variables of the program assigned values by $\sigma$, is assumed to exist. Since there
is no need to declare variables in a program, there will be no error handling due to
undeclared variables or out-of-bound indices. While these are important considerations
in practice, they are not important to the goals of this research.

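A state, $\sigma'$, that assigns the same value to all but one location as another state, $\sigma$,
is known as a state variant. State variants will be important in defining the meaning
of statements.

Definition 3.2.2 (Variants of a state)

For each $\sigma \in \Sigma$ and $\alpha \in V$ we write $\sigma\{\alpha/\xi\}$ for the element of $\Sigma$ which satisfies,
for each $\xi' \in LocV$:

$\sigma\{\alpha/\xi\}(\xi') = \begin{cases} \alpha & \text{if } \xi' = \xi \\ \sigma(\xi') & \text{otherwise} \end{cases}$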
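
A state variant can be pictured with a dictionary from locations to integers. The sketch below is an illustration only; the dictionary representation and the function name `variant` are assumptions, with a simple variable standing for its own location and an array element represented as a pair (array, index).

```python
def variant(sigma, alpha, xi):
    """Return the state sigma{alpha/xi}: equal to sigma except that xi maps to alpha."""
    new_state = dict(sigma)   # copy, so the original state is left unchanged
    new_state[xi] = alpha
    return new_state


# Locations: a simple variable is its own name; an array element is (array, index).
sigma = {"x": 1, "y": 2, ("a", 1): 7}
tau = variant(sigma, 5, "x")
```

Here `tau` agrees with `sigma` everywhere except at the location `x`.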


Aliasing  (two  or  more  names  for  the  same  memory  location)  can  cause  special 
problems  when  determining  if  a transformation  is  valid.  A transformation  which  may 
seem  to  preserve  the  meaning  of  a statement  (such  as  interchanging  x :=  7;  y :=  8) 
may  in  fact  change  the  meaning  if  there  is  aliasing  (in  this  case,  if  x and  y refer  to  the 
same memory location). Although this language is simple and does not include many
of the features which introduce aliases (such as pointers and variable parameters),
because arrays are allowed, it is possible that $a[s_1]$ and $a[s_2]$ refer to the same element
of the array $a$. To avoid possible aliasing problems, the concept of two variables being
always separate is used.

Definition 3.2.3 (Always separate)

If $v_1, v_2 \in Ivar$, then we say $v_1$ and $v_2$ are always separate, written $sep(v_1, v_2)$, if
there is no $\sigma \in \Sigma$ with $\mathcal{L}(v_1)(\sigma) = \mathcal{L}(v_2)(\sigma)$. We say $v$ is always separate from
a set of variables $I$, written $sep(v, I)$, if $sep(v, v_i)$ for all $v_i \in I$.

Changing  a variable’s  value  may  also  change  its  location,  as  may  happen  with 
a[a[x]]  when  both  x and  a[x]  have  the  same  value.  Variables  of  this  form  are  the 
exception  to  many  of  the  following  transformations  and  are  said  to  be  self-referencing. 

Definition 3.2.4 (Strictly non-self-referencing variables)

A variable $v$ is strictly non-self-referencing if $\mathcal{L}(v)(\sigma\{\alpha/\mathcal{L}(v)(\sigma)\}) = \mathcal{L}(v)(\sigma)$, for any
state $\sigma$ and any integer $\alpha$.

Any  simple  variable  is  strictly  non-self-referencing,  as  is  any  indexed  variable 
whose  subscript  does  not  include  a reference  to  the  array  being  indexed. 

Another  form  of  variable  interdependence  which  may  cause  trouble  occurs  when 
changing  the  value  of  one  variable  may  change  the  location  of  another.  The  variable 
whose  location  is  changed  is  said  to  be  location-dependent  on  the  variable  whose  value 
changes. 

Definition 3.2.5 (Location-independent)

A variable $v$ is location-independent of a variable $w$ if $\mathcal{L}(v)(\sigma) = \mathcal{L}(v)(\sigma\{\alpha/\mathcal{L}(w)(\sigma)\})$
for all integers $\alpha$ and all states $\sigma$.

Any simple variable is location-independent of any other variable. An array
reference is location-independent of another variable if the other variable is not in the
expression indexing the array. It should be noted that location independence is not
a symmetric relation. The variable $x$ is location-independent of $a[x]$, but $a[x]$ is not
location-independent of $x$.
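
The two kinds of interdependence can be observed on a concrete state. The following sketch assumes a dictionary representation of states and a tuple encoding `("idx", array, index_expr)` for array references; the helper names `value`, `location` and `variant` are hypothetical and chosen only for this illustration.

```python
def value(e, sigma):
    """Value of an expression: int constant, simple variable name, or ("idx", a, e)."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return sigma[e]
    _, array, index = e
    return sigma[(array, value(index, sigma))]


def location(v, sigma):
    """Location of a variable: the name itself, or the pair (array, index value)."""
    if isinstance(v, str):
        return v
    _, array, index = v
    return (array, value(index, sigma))


def variant(sigma, alpha, xi):
    new_state = dict(sigma)
    new_state[xi] = alpha
    return new_state


sigma = {"x": 1, ("a", 1): 1, ("a", 2): 9}
a_x = ("idx", "a", "x")        # a[x]
a_a_x = ("idx", "a", a_x)      # a[a[x]]

# Assigning to a[a[x]] changes where a[a[x]] lives: it is self-referencing.
before = location(a_a_x, sigma)
after = location(a_a_x, variant(sigma, 2, before))

# Changing x moves a[x], so a[x] is not location-independent of x ...
moved = location(a_x, variant(sigma, 2, location("x", sigma)))
# ... but changing a[x] cannot move x.
fixed = location("x", variant(sigma, 5, location(a_x, sigma)))
```

In this state both `x` and `a[x]` hold 1, which is the situation the text singles out for `a[a[x]]`.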

3.3  Complexity  Measures 

For the sake of inductive proofs, a structural complexity function $c$ is introduced
for the language elements. The complexity of any combination of elements must be
greater than the complexity of each of the elements comprising it.

Definition 3.3.1 (Complexity of Iexp)
$c(x) = 1$
$c(a[s]) = 1 + c(s)$
$c(m) = 1$
$c(s_1 \oplus s_2) = 1 + c(s_1) + c(s_2)$
$c(\mathbf{if}\ b\ \mathbf{then}\ s_1\ \mathbf{else}\ s_2\ \mathbf{fi}) = 1 + c(b) + c(s_1) + c(s_2)$

Definition 3.3.2 (Complexity of Bexp)
$c(\mathbf{true}) = 1$
$c(\mathbf{false}) = 1$
$c(s_1 \oslash s_2) = 1 + c(s_1) + c(s_2)$
$c(\neg b) = 1 + c(b)$
$c(s\ \mathbf{in}\ (s_1 \ldots s_2)) = 1 + c(s) + c(s_1) + c(s_2)$

Definition 3.3.3 (Complexity of Stat)
$c(v := s) = 1$
$c(S_1; S_2) = c(S_1) + c(S_2)$
$c(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi}) = 1 + c(b) + c(S_1) + c(S_2)$
$c(D) = 1$
$c(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od}) = 1 + c(s_1) + c(s_2) + c(S)$
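
The three complexity definitions translate directly into one recursive function. The sketch below is illustrative: the tagged-tuple encoding of the syntax (tags such as `"idx"`, `"op"`, `"rel"`, `"ife"`, `"skip"`) is an assumption made for the example, not notation from the source.

```python
def c(node):
    """Structural complexity following Definitions 3.3.1 to 3.3.3."""
    if isinstance(node, (int, str)):           # constant m or simple variable x
        return 1
    tag = node[0]
    if tag in ("true", "false", "skip", "assign"):
        return 1                                # includes c(v := s) = 1 and c(D) = 1
    if tag == "idx":                            # ("idx", a, s) encodes a[s]
        return 1 + c(node[2])
    if tag in ("op", "rel"):                    # (tag, symbol, s1, s2)
        return 1 + c(node[2]) + c(node[3])
    if tag == "not":                            # ("not", b)
        return 1 + c(node[1])
    if tag in ("ife", "in", "if"):              # three sub-terms each
        return 1 + c(node[1]) + c(node[2]) + c(node[3])
    if tag == "seq":                            # ("seq", S1, S2)
        return c(node[1]) + c(node[2])
    if tag == "for":                            # ("for", x, s1, s2, S)
        return 1 + c(node[2]) + c(node[3]) + c(node[4])
    raise ValueError(f"unknown node: {node!r}")


expr = ("idx", "a", ("op", "+", "x", 1))                      # a[x + 1]
loop = ("for", "i", 1, "n",
        ("seq", ("assign", "x", 0), ("assign", "y", "i")))    # for i := 1 to n do x := 0; y := i od
expr_complexity = c(expr)
loop_complexity = c(loop)
```

Because every element has complexity at least 1, a sequence is still strictly more complex than either of its parts even though no extra 1 is added for it.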


3.4  Syntax  of  Substitution 


Substitution of expressions for simple variables is important in defining the
semantics of loops. It will also play a part in statement interchange and absorption.
There are two kinds of substitution. Square brackets ([ ]) are used to denote textual
substitution into an expression or statement and angle brackets (< >) denote textual
substitution into the left-hand-side of an assignment statement.

Substitution  for  array  variables  is  not  defined  here  for  a variety  of  reasons.  The 
definition  of  substitution  for  an  array  variable  traditionally  includes  conditional  ex- 
pressions, which  would  greatly  complicate  the  proofs  in  Chapter  4.  Substitution  for 
array  variables  in  statements  is  even  more  complicated.  This  added  complexity  would 
provide  little  new  functionality  to  the  language  since  loop  control  variables  must  be 
simple variables. (I am not alone in this exclusion of substitution for array variables;
de Bakker knowingly omits an explanation of $S[v_1/v_2]$ as well [9].)

Definition 3.4.1 (Substitution of expressions into Iexp)

$x[s/y] = \begin{cases} s & \text{if } y = x \\ x & \text{otherwise} \end{cases}$
$a[s_1][s/y] = a[s_1[s/y]]$
$m[s/y] = m$
$(s_1 \oplus s_2)[s/y] = (s_1[s/y] \oplus s_2[s/y])$
$(\mathbf{if}\ b\ \mathbf{then}\ s_1\ \mathbf{else}\ s_2\ \mathbf{fi})[s/y] = (\mathbf{if}\ b[s/y]\ \mathbf{then}\ s_1[s/y]\ \mathbf{else}\ s_2[s/y]\ \mathbf{fi})$

Definition 3.4.2 (Substitution of expressions into Bexp)
$\mathbf{true}[s/y] = \mathbf{true}$
$\mathbf{false}[s/y] = \mathbf{false}$
$(s_1 \oslash s_2)[s/y] = (s_1[s/y] \oslash s_2[s/y])$
$(\neg b)[s/y] = \neg(b[s/y])$
$(s_0\ \mathbf{in}\ (s_1 \ldots s_2))[s/y] = (s_0[s/y]\ \mathbf{in}\ (s_1[s/y] \ldots s_2[s/y]))$

Left-hand-side substitution will not substitute for a simple variable, but will
substitute in the indexing expression of an array reference. Thus $a[x]\langle s/x\rangle = a[s]$ and
$x[s/x] = s$, but $x\langle s/x\rangle = x$.

Definition 3.4.3 (Left-hand-side substitution)

$x\langle s/y\rangle = x$
$a[s_1]\langle s/y\rangle = a[s_1[s/y]]$


Substituting  into  a statement  involves  substituting  for  all  variables  in  the  state- 
ment, much  like  substituting  in  an  expression.  The  difference  here  is  that  left-hand- 
side  substitution  is  performed  on  the  left-hand-side  of  assignment  statements  and 
textual  substitution  is  done  everywhere  else. 

Definition 3.4.4 (Substitution of expressions into Stat)

$(v := s_1)[s/y] = v\langle s/y\rangle := s_1[s/y]$
$(S_1; S_2)[s/y] = S_1[s/y]; S_2[s/y]$
$(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi})[s/y] = \mathbf{if}\ b[s/y]\ \mathbf{then}\ S_1[s/y]\ \mathbf{else}\ S_2[s/y]\ \mathbf{fi}$
$(D)[s/y] = D$
$(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})[s/y] = (\mathbf{for}\ x := s_1[s/y]\ \mathbf{to}\ s_2[s/y]\ \mathbf{do}\ S[s/y]\ \mathbf{od})$
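
The substitution definitions of this section can be sketched as three mutually supporting functions. Everything about the encoding is an assumption made for illustration: expressions and statements are tagged tuples, and the names `subst`, `lhs_subst` and `subst_stat` are hypothetical.

```python
def subst(e, s, y):
    """Textual substitution e[s/y] on integer and boolean expressions."""
    if isinstance(e, str):                      # simple variable
        return s if e == y else e
    if isinstance(e, int):                      # constant
        return e
    tag = e[0]
    if tag in ("true", "false"):
        return e
    if tag == "idx":                            # a[s1][s/y] = a[s1[s/y]]
        return ("idx", e[1], subst(e[2], s, y))
    if tag in ("op", "rel"):                    # (tag, symbol, s1, s2)
        return (tag, e[1], subst(e[2], s, y), subst(e[3], s, y))
    if tag in ("not", "ife", "in"):             # every component is an expression
        return (tag,) + tuple(subst(part, s, y) for part in e[1:])
    raise ValueError(f"unknown expression: {e!r}")


def lhs_subst(v, s, y):
    """Left-hand-side substitution v<s/y>: only array indices are rewritten."""
    if isinstance(v, str):
        return v
    return ("idx", v[1], subst(v[2], s, y))


def subst_stat(S, s, y):
    """Substitution S[s/y] on statements."""
    tag = S[0]
    if tag == "skip":                           # the empty statement D
        return S
    if tag == "assign":                         # ("assign", v, s1)
        return ("assign", lhs_subst(S[1], s, y), subst(S[2], s, y))
    if tag == "seq":
        return ("seq", subst_stat(S[1], s, y), subst_stat(S[2], s, y))
    if tag == "if":                             # ("if", b, S1, S2)
        return ("if", subst(S[1], s, y),
                subst_stat(S[2], s, y), subst_stat(S[3], s, y))
    if tag == "for":                            # ("for", x, s1, s2, S)
        return ("for", S[1], subst(S[2], s, y), subst(S[3], s, y),
                subst_stat(S[4], s, y))
    raise ValueError(f"unknown statement: {S!r}")


s = ("op", "+", "z", 1)                         # the expression z + 1
a_x = ("idx", "a", "x")                         # a[x]
stmt = subst_stat(("assign", "x", ("op", "+", "x", "y")), 3, "y")
```

With these definitions, `lhs_subst(a_x, s, "x")` rewrites the index while `lhs_subst("x", s, "x")` leaves the simple variable alone, matching the three examples given before Definition 3.4.3.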

A statement with a pair of substitutions (i.e., $S[x/y][n/x]$) may simplify to a
statement with a single substitution (i.e., $S[n/y]$) in some cases. This result will
be  used  in  other  proofs  in  later  chapters  (most  notably  the  proof  of  loop  joining  in 
Theorem  5.1.1)  and  is  presented  here  as  an  example  of  a complete  inductive  proof  in 
this  language.  In  later  inductive  proofs,  only  the  basis  cases  will  be  proven  because 
of  the  length  of  the  proofs  and  the  fact  that  there  is  very  little  of  interest  to  be  found 
in  the  inductive  steps. 

Since  statements  use  both  integer  and  boolean  expressions,  this  result  must  first 
be  shown  for  expressions.  It  is  provided  in  the  lemma  below.  This  lemma  refers  to 
ivar,  the  set  of  all  variables  in  a statement  or  expression.  A fuller  definition  of  ivar  is 
given  in  Section  3.9. 

Lemma 3.4.1

$\models s[n/y] = s[x/y][n/x]$ provided: $x \notin ivar(s)$
and
$\models b[n/y] = b[x/y][n/x]$ provided: $x \notin ivar(b)$
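
The role of the proviso can be checked on two small expressions. This sketch covers only a fragment of Iexp (constants, simple variables, array references and a single binary operator written `"+"`); the encoding and the helper name `subst` are assumptions for illustration.

```python
def subst(e, s, y):
    """e[s/y] for a fragment: int constants, variable names, ("idx", a, e), ("+", e1, e2)."""
    if isinstance(e, str):
        return s if e == y else e
    if isinstance(e, int):
        return e
    if e[0] == "idx":
        return ("idx", e[1], subst(e[2], s, y))
    return (e[0], subst(e[1], s, y), subst(e[2], s, y))


n = 3
s = ("idx", "a", ("+", "y", "z"))                # a[y + z]; x does not occur in s
direct = subst(s, n, "y")                         # s[n/y]
two_step = subst(subst(s, "x", "y"), n, "x")      # s[x/y][n/x]

t = ("+", "y", "x")                              # y + x; here x does occur
t_direct = subst(t, n, "y")                       # 3 + x
t_two_step = subst(subst(t, "x", "y"), n, "x")    # 3 + 3
```

The first pair of results coincides, while the second pair differs because the occurrence of `x` already present in `t` is captured by the second substitution.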

Proof: By simultaneous induction on the complexity of $s$ and $b$.

Basis: $c(s) = 1$ and $c(b) = 1$

Case 1: $s = y$, $x \neq y$
$y[n/y] = n$    (Def. of subst. into expr. [3.4.1])
$= x[n/x]$    (Def. of subst. into expr. [3.4.1])
$= (y[x/y])[n/x]$    (Def. of subst. into expr. [3.4.1])

Case 2: $s = z$, $z \neq y$
$z[n/y] = z$    (Def. of subst. into expr. [3.4.1])
$= z[n/x]$    (Def. of subst. into expr. [3.4.1])
$= (z[x/y])[n/x]$    (Def. of subst. into expr. [3.4.1])

Case 3: $s = m$
$m[n/y] = m$    (Def. of subst. into expr. [3.4.1])
$= m[n/x]$    (Def. of subst. into expr. [3.4.1])
$= (m[x/y])[n/x]$    (Def. of subst. into expr. [3.4.1])

Case 4: $b = \mathbf{true}$
$\mathbf{true}[n/y] = \mathbf{true}$    (Def. of subst. into expr. [3.4.2])
$= \mathbf{true}[n/x]$    (Def. of subst. into expr. [3.4.2])
$= (\mathbf{true}[x/y])[n/x]$    (Def. of subst. into expr. [3.4.2])

Case 5: $b = \mathbf{false}$
$\mathbf{false}[n/y] = \mathbf{false}$    (Def. of subst. into expr. [3.4.2])
$= \mathbf{false}[n/x]$    (Def. of subst. into expr. [3.4.2])
$= (\mathbf{false}[x/y])[n/x]$    (Def. of subst. into expr. [3.4.2])

Induction step: Assume that $s[n/y] = s[x/y][n/x]$ and $b[n/y] = b[x/y][n/x]$
whenever $c(s) \le k$ and $c(b) \le k$ ($k > 0$).

Show that it is true when $c(s) = k + 1$ and $c(b) = k + 1$.

Case 1: $s = a[s_1]$
$a[s_1][n/y] = a[s_1[n/y]]$    (Def. of subst. into expr. [3.4.1])
$= a[s_1[x/y][n/x]]$    (Induction hypothesis)
$= a[s_1[x/y]][n/x]$    (Def. of subst. into expr. [3.4.1])
$= a[s_1][x/y][n/x]$    (Def. of subst. into expr. [3.4.1])

Case 2: $s = s_1 \oplus s_2$
$(s_1 \oplus s_2)[n/y] = (s_1[n/y] \oplus s_2[n/y])$    (Def. of subst. into expr. [3.4.1])
$= (s_1[x/y][n/x] \oplus s_2[x/y][n/x])$    (Induction hypothesis)
$= (s_1[x/y] \oplus s_2[x/y])[n/x]$    (Def. of subst. into expr. [3.4.1])
$= (s_1 \oplus s_2)[x/y][n/x]$    (Def. of subst. into expr. [3.4.1])

Case 3: $s = \mathbf{if}\ b\ \mathbf{then}\ s_1\ \mathbf{else}\ s_2\ \mathbf{fi}$
$(\mathbf{if}\ b\ \mathbf{then}\ s_1\ \mathbf{else}\ s_2\ \mathbf{fi})[n/y] = \mathbf{if}\ b[n/y]\ \mathbf{then}\ s_1[n/y]\ \mathbf{else}\ s_2[n/y]\ \mathbf{fi}$    (Def. of subst. into expr. [3.4.1])
$= \mathbf{if}\ b[x/y][n/x]\ \mathbf{then}\ s_1[x/y][n/x]\ \mathbf{else}\ s_2[x/y][n/x]\ \mathbf{fi}$    (Induction hypothesis)
$= (\mathbf{if}\ b[x/y]\ \mathbf{then}\ s_1[x/y]\ \mathbf{else}\ s_2[x/y]\ \mathbf{fi})[n/x]$    (Def. of subst. into expr. [3.4.1])
$= (\mathbf{if}\ b\ \mathbf{then}\ s_1\ \mathbf{else}\ s_2\ \mathbf{fi})[x/y][n/x]$    (Def. of subst. into expr. [3.4.1])

Case 4: $b = s_1 \oslash s_2$
$(s_1 \oslash s_2)[n/y] = s_1[n/y] \oslash s_2[n/y]$    (Def. of subst. into expr. [3.4.2])
$= s_1[x/y][n/x] \oslash s_2[x/y][n/x]$    (Induction hypothesis)
$= (s_1[x/y] \oslash s_2[x/y])[n/x]$    (Def. of subst. into expr. [3.4.2])
$= (s_1 \oslash s_2)[x/y][n/x]$    (Def. of subst. into expr. [3.4.2])

Case 5: $b = \neg b_1$
$(\neg b_1)[n/y] = \neg(b_1[n/y])$    (Def. of subst. into expr. [3.4.2])
$= \neg(b_1[x/y][n/x])$    (Induction hypothesis)
$= (\neg(b_1[x/y]))[n/x]$    (Def. of subst. into expr. [3.4.2])
$= (\neg b_1)[x/y][n/x]$    (Def. of subst. into expr. [3.4.2])

Case 6: $b = (s\ \mathbf{in}\ (s_1 \ldots s_2))$
$(s\ \mathbf{in}\ (s_1 \ldots s_2))[n/y]$
$= (s[n/y]\ \mathbf{in}\ (s_1[n/y] \ldots s_2[n/y]))$    (Def. of subst. into expr. [3.4.2])
$= (s[x/y][n/x]\ \mathbf{in}\ (s_1[x/y][n/x] \ldots s_2[x/y][n/x]))$    (Induction hypothesis)
$= (s[x/y]\ \mathbf{in}\ (s_1[x/y] \ldots s_2[x/y]))[n/x]$    (Def. of subst. into expr. [3.4.2])
$= (s\ \mathbf{in}\ (s_1 \ldots s_2))[x/y][n/x]$    (Def. of subst. into expr. [3.4.2])

With  the  preliminary  result  proved  above,  the  following  lemma  can  now  be  proved. 

Lemma 3.4.2

$\models S[n/y] = (S[x/y])[n/x]$

Provided:
$x \notin ivar(S)$

Proof: By mathematical induction on the complexity of $S$.

Basis: $c(S) = 1$

Case 1: $S = D$
$D[n/y] = D$    (Def. of subst. into stat. [3.4.4])
$= D[n/x]$    (Def. of subst. into stat. [3.4.4])
$= D[x/y][n/x]$    (Def. of subst. into stat. [3.4.4])

Case 2: $S = x := s$
$(x := s)[n/y] = x\langle n/y\rangle := s[n/y]$    (Def. of subst. into stat. [3.4.4])
$= x := s[n/y]$    (Def. of subst. into l.h.s. [3.4.3])
$= x := s[x/y][n/x]$    (Previous result [3.4.1])
$= x\langle n/x\rangle := s[x/y][n/x]$    (Def. of subst. into l.h.s. [3.4.3])
$= x\langle x/y\rangle\langle n/x\rangle := s[x/y][n/x]$    (Def. of subst. into l.h.s. [3.4.3])
$= (x\langle x/y\rangle := s[x/y])[n/x]$    (Def. of subst. into stat. [3.4.4])
$= (x := s)[x/y][n/x]$    (Def. of subst. into stat. [3.4.4])

Case 3: $S = a[s_1] := s$
$(a[s_1] := s)[n/y] = a[s_1]\langle n/y\rangle := s[n/y]$    (Def. of subst. into stat. [3.4.4])
$= a[s_1[n/y]] := s[n/y]$    (Def. of subst. into l.h.s. [3.4.3])
$= a[s_1[x/y][n/x]] := s[x/y][n/x]$    (Previous result [3.4.1])
$= a[s_1[x/y]]\langle n/x\rangle := s[x/y][n/x]$    (Def. of subst. into l.h.s. [3.4.3])
$= (a[s_1[x/y]] := s[x/y])[n/x]$    (Def. of subst. into stat. [3.4.4])
$= (a[s_1]\langle x/y\rangle := s[x/y])[n/x]$    (Def. of subst. into l.h.s. [3.4.3])
$= (a[s_1] := s)[x/y][n/x]$    (Def. of subst. into stat. [3.4.4])


Induction step: Assume that $S[n/y] = S[x/y][n/x]$ whenever $c(S) \le k$.
Show it is true when $c(S) = k + 1$.

Case 1: $S = S_1; S_2$
$(S_1; S_2)[n/y] = S_1[n/y]; S_2[n/y]$    (Def. of subst. into stat. [3.4.4])
$= S_1[x/y][n/x]; S_2[x/y][n/x]$    (Induction hypothesis)
$= (S_1[x/y]; S_2[x/y])[n/x]$    (Def. of subst. into stat. [3.4.4])
$= (S_1; S_2)[x/y][n/x]$    (Def. of subst. into stat. [3.4.4])

Case 2: $S = \mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi}$
$(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi})[n/y] = \mathbf{if}\ b[n/y]\ \mathbf{then}\ S_1[n/y]\ \mathbf{else}\ S_2[n/y]\ \mathbf{fi}$    (Def. of subst. into stat. [3.4.4])
$= \mathbf{if}\ b[x/y][n/x]\ \mathbf{then}\ S_1[x/y][n/x]\ \mathbf{else}\ S_2[x/y][n/x]\ \mathbf{fi}$    (Induction hypothesis)
$= (\mathbf{if}\ b[x/y]\ \mathbf{then}\ S_1[x/y]\ \mathbf{else}\ S_2[x/y]\ \mathbf{fi})[n/x]$    (Def. of subst. into stat. [3.4.4])
$= (\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi})[x/y][n/x]$    (Def. of subst. into stat. [3.4.4])

Case 3: $S = \mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od}$
$(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})[n/y] = \mathbf{for}\ x := s_1[n/y]\ \mathbf{to}\ s_2[n/y]\ \mathbf{do}\ S[n/y]\ \mathbf{od}$    (Def. of subst. into stat. [3.4.4])
$= \mathbf{for}\ x := s_1[x/y][n/x]\ \mathbf{to}\ s_2[x/y][n/x]\ \mathbf{do}\ S[x/y][n/x]\ \mathbf{od}$    (Induction hypothesis)
$= (\mathbf{for}\ x := s_1[x/y]\ \mathbf{to}\ s_2[x/y]\ \mathbf{do}\ S[x/y]\ \mathbf{od})[n/x]$    (Def. of subst. into stat. [3.4.4])
$= (\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})[x/y][n/x]$    (Def. of subst. into stat. [3.4.4])


Table 3.2. Semantic Notation

Name                        Description                                                                  Typical elements
$V$                         Integers                                                                     $\alpha$
$W$                         Truth values (T and F)                                                       $\beta$
$\Sigma$                    States (functions from LocV to $V$)                                          $\sigma$
LocV                        Intermediate variables (Svar $\cup$ (Avar $\times V$))                       $\xi$
$\mathcal{L}$               Location of Ivar ($\mathcal{L}$: Ivar $\to (\Sigma \to$ LocV$)$)
$\mathcal{R}$               Value of Iexp ($\mathcal{R}$: Iexp $\to (\Sigma \to V)$)
$\mathcal{W}$               Value of Bexp ($\mathcal{W}$: Bexp $\to (\Sigma \to W)$)
$\mathcal{M}$               Value of Stat ($\mathcal{M}$: Stat $\to (\Sigma \to \Sigma)$)
$\sigma\{\alpha/\xi\}$      Variant of a state
$\bar{\alpha}$              Constant representing the value of the integer $\alpha$


3.5  Semantics  of  the  Language 

Once  the  syntax  of  the  language,  states  and  substitution  is  defined,  the  semantics 
follow.  The  notation  used  in  defining  the  semantics  of  this  language  is  given  in 
Table  3.2.  The  meanings  of  integer  and  boolean  expressions  and  statements  are 
fairly straightforward. A bar over an integer's value, $\bar{\alpha}$, will be used to indicate a
constant with the value of that integer.

Definition 3.5.1 (Semantics of Iexp)

$\mathcal{R}(v)(\sigma) = \sigma(\mathcal{L}(v)(\sigma))$
$\mathcal{R}(m)(\sigma) = \alpha$, where $\alpha$ is the mathematical constant associated with $m$
$\mathcal{R}(s_1 \oplus s_2)(\sigma) = \mathcal{R}(s_1)(\sigma) \oplus \mathcal{R}(s_2)(\sigma)$
$\mathcal{R}(\mathbf{if}\ b\ \mathbf{then}\ s_1\ \mathbf{else}\ s_2\ \mathbf{fi})(\sigma) = \mathbf{if}\ \mathcal{W}(b)(\sigma)\ \mathbf{then}\ \mathcal{R}(s_1)(\sigma)\ \mathbf{else}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{fi}$

It is further required that all expressions evaluate without error. While there
could be special cases for statements which could not be evaluated (such as x/0),
this only complicates the presentation without providing greater understanding of
the correctness proofs here. Thus it is assumed no semantic errors will occur during
program execution and no attempt is made to provide meanings for erroneous states
or conditions.

Definition 3.5.2 (Semantics of Bexp)

$\mathcal{W}(\mathbf{true})(\sigma) = T$
$\mathcal{W}(\mathbf{false})(\sigma) = F$
$\mathcal{W}(s_1 \oslash s_2)(\sigma) = (\mathcal{R}(s_1)(\sigma) \oslash \mathcal{R}(s_2)(\sigma))$
$\mathcal{W}(\neg b)(\sigma) = \neg\mathcal{W}(b)(\sigma)$
$\mathcal{W}(s\ \mathbf{in}\ (s_1 \ldots s_2))(\sigma) = \mathcal{R}(s)(\sigma) \ge \mathcal{R}(s_1)(\sigma)$ and $\mathcal{R}(s)(\sigma) \le \mathcal{R}(s_2)(\sigma)$

The semantics of statements is also unsurprising. The meaning of a statement in a
state $\sigma$ is just a variant of $\sigma$. So, for any statement, $\mathcal{M}(S)(\sigma) = \sigma\{\alpha_1/\xi_1\} \cdots \{\alpha_n/\xi_n\}$
where the $\xi_i$ are the locations of the variables assigned by the statement. Empty
statements  have  no  effect  on  the  state  in  which  they  are  executed.  Assignment 
statements  result  in  a variant  of  the  state  in  which  they  are  executed  by  substituting 
the  value  of  the  right-hand  side  of  the  assignment  for  the  location  of  the  left-hand 
side  of  the  assignment. 

The  for  statement  has  probably  the  most  interesting  semantic  definition.  If  the 
value  of  the  upper  bound  is  greater  than  or  equal  to  the  value  of  the  lower  bound 
in  the  state  in  which  the  for  statement  is  being  executed,  then  the  statement  in  the 
loop  will  be  executed  at  least  once.  The  for  statement  then  has  the  meaning  of  the 
same  for  statement,  with  one  fewer  iterations  of  the  loop,  followed  by  the  statement 
in the for body, with the original value of the upper bound substituted for the loop
control variable. This has the effect of making any assignment to the loop control
variable legal, but meaningless. For example, the statement for i := 1 to 10 do
i := 100; sum := sum + i od has the same meaning as the statement for i := 1
to 9 do i := 100; sum := sum + i od; i := 100; sum := sum + 10. If the upper
bound's value is less than the lower bound's value, then the for statement has the
same meaning as the empty statement. Notice that since the loop bounds are fixed
at the time of entrance to the loop, it is impossible to have infinite looping.

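Definition 3.5.3 (Semantics of Stat)

$\mathcal{M}(v := s)(\sigma) = \sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(v)(\sigma)\}$
$\mathcal{M}(S_1; S_2)(\sigma) = \mathcal{M}(S_2)(\mathcal{M}(S_1)(\sigma))$
$\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi})(\sigma) = \mathbf{if}\ \mathcal{W}(b)(\sigma)\ \mathbf{then}\ \mathcal{M}(S_1)(\sigma)\ \mathbf{else}\ \mathcal{M}(S_2)(\sigma)\ \mathbf{fi}$
$\mathcal{M}(D)(\sigma) = \sigma$
$\mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})(\sigma) =$
    $\mathbf{if}\ \mathcal{R}(s_2)(\sigma) \ge \mathcal{R}(s_1)(\sigma)\ \mathbf{then}$
        $\mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S\ \mathbf{od};\ S[\overline{\mathcal{R}(s_2)(\sigma)}/x])(\sigma)$
    $\mathbf{else}\ \mathcal{M}(D)(\sigma)\ \mathbf{fi}$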
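
The recursive meaning of the for statement can be tried out with a small interpreter. The sketch below is not the prototype system of this dissertation; it is a minimal illustration under stated assumptions: states are dictionaries, syntax is encoded as tagged tuples, only the operators `+` and `-` are supported, selection statements are omitted for brevity, and the names `value`, `location`, `subst`, `subst_stat` and `meaning` are hypothetical.

```python
def value(e, sigma):
    """Value of an integer expression in state sigma."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return sigma[e]
    tag, left, right = e
    if tag == "idx":                         # ("idx", array_name, index_expr)
        return sigma[(left, value(right, sigma))]
    if tag == "+":
        return value(left, sigma) + value(right, sigma)
    if tag == "-":
        return value(left, sigma) - value(right, sigma)
    raise ValueError(f"unknown expression: {e!r}")


def location(v, sigma):
    return v if isinstance(v, str) else (v[1], value(v[2], sigma))


def subst(e, s, y):
    if isinstance(e, str):
        return s if e == y else e
    if isinstance(e, int):
        return e
    if e[0] == "idx":
        return ("idx", e[1], subst(e[2], s, y))
    return (e[0], subst(e[1], s, y), subst(e[2], s, y))


def subst_stat(S, s, y):
    tag = S[0]
    if tag == "skip":
        return S
    if tag == "assign":                      # left-hand side: only array indices change
        v = S[1] if isinstance(S[1], str) else subst(S[1], s, y)
        return ("assign", v, subst(S[2], s, y))
    if tag == "seq":
        return ("seq", subst_stat(S[1], s, y), subst_stat(S[2], s, y))
    if tag == "for":
        return ("for", S[1], subst(S[2], s, y), subst(S[3], s, y),
                subst_stat(S[4], s, y))
    raise ValueError(f"unknown statement: {S!r}")


def meaning(S, sigma):
    """Meaning of a statement: the state reached from sigma."""
    tag = S[0]
    if tag == "skip":
        return sigma
    if tag == "assign":
        new_state = dict(sigma)
        new_state[location(S[1], sigma)] = value(S[2], sigma)
        return new_state
    if tag == "seq":
        return meaning(S[2], meaning(S[1], sigma))
    if tag == "for":
        _, x, s1, s2, body = S
        upper = value(s2, sigma)
        if upper >= value(s1, sigma):
            shorter = ("for", x, s1, ("-", s2, 1), body)   # one fewer iteration
            last = subst_stat(body, upper, x)             # body with the bound's value for x
            return meaning(("seq", shorter, last), sigma)
        return sigma                                       # same as the empty statement
    raise ValueError(f"unknown statement: {S!r}")


# for i := 1 to 3 do i := 100; sum := sum + i od
loop = ("for", "i", 1, 3,
        ("seq", ("assign", "i", 100), ("assign", "sum", ("+", "sum", "i"))))
result = meaning(loop, {"sum": 0})
```

In this run the assignment to the control variable does not disturb the iteration: each use of `i` in the body has been replaced by a constant before the body is executed, so `sum` accumulates 1 + 2 + 3 while `i` simply ends up holding 100.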

Two  statements  are  equal  in  a state  if  they  have  the  same  meaning  in  that  state. 
They  are  equal  if  they  have  the  same  meaning  in  all  states. 

Definition 3.5.4 (Equality of statements)

Two statements $S_1$ and $S_2$ are equal in a state $\sigma$, written $S_1 =_\sigma S_2$, if
$\mathcal{M}(S_1)(\sigma) = \mathcal{M}(S_2)(\sigma)$.

Two statements $S_1$ and $S_2$ are equal, written $S_1 = S_2$, if for all states $\sigma \in \Sigma$,
$S_1 =_\sigma S_2$.

A transformation of a statement $S_1$ into a statement $S_2$ is valid in state $\sigma$ if $S_1 =_\sigma S_2$.

A statement is said to be nullable if and only if it has no effect on any state.
Nullable statements are of the form x := x, or some combination of nullable statements,
such as if $b$ then x := x else for y := $s_1$ to $s_2$ do y := y; z := z od fi.

Definition  3.5.5  ( Nullable  statements) 

A statement  S is  nullable  iff  S = D. 


3.6  Semantics  of  State  Variants 


Once  the  semantics  of  statements  is  established,  some  obvious  results  about  state 
variants  and  their  meanings  can  be  proved.  First,  state  variants  can  be  interchanged 
when  they  refer  to  different  locations. 

Lemma 3.6.1 (Interchange of state variants)

$(\sigma\{\alpha_1/\xi_1\})\{\alpha_2/\xi_2\} = (\sigma\{\alpha_2/\xi_2\})\{\alpha_1/\xi_1\}$

Provided:

$\xi_1 \neq \xi_2$
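
The proviso is easy to see on a concrete state. The following sketch assumes the dictionary representation of states used purely for illustration, with a hypothetical helper `variant` standing for $\sigma\{\alpha/\xi\}$.

```python
def variant(sigma, alpha, xi):
    """Return sigma{alpha/xi} as a new dictionary."""
    new_state = dict(sigma)
    new_state[xi] = alpha
    return new_state


sigma = {"x": 0, "y": 0}

# Different locations: the two variants can be applied in either order.
xy = variant(variant(sigma, 1, "x"), 2, "y")
yx = variant(variant(sigma, 2, "y"), 1, "x")

# Same location: the order matters, because the later variant wins.
x12 = variant(variant(sigma, 1, "x"), 2, "x")
x21 = variant(variant(sigma, 2, "x"), 1, "x")
```

The first pair of states is identical; the second pair differs at `x`.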

Proof:

It must be shown that $(\sigma\{\alpha_1/\xi_1\})\{\alpha_2/\xi_2\}(\xi) = (\sigma\{\alpha_2/\xi_2\})\{\alpha_1/\xi_1\}(\xi)$ for all $\xi \in LocV$.

Case 1: $\xi = \xi_1$
$(\sigma\{\alpha_1/\xi_1\})\{\alpha_2/\xi_2\}(\xi) = \sigma\{\alpha_1/\xi_1\}(\xi)$    (Def. of state variant [3.2.2])
$= \alpha_1$    (Def. of state variant [3.2.2])
$= (\sigma\{\alpha_2/\xi_2\})\{\alpha_1/\xi_1\}(\xi)$    (Def. of state variant [3.2.2])

Case 2: $\xi = \xi_2$
$(\sigma\{\alpha_1/\xi_1\})\{\alpha_2/\xi_2\}(\xi) = \alpha_2$    (Def. of state variant [3.2.2])
$= \sigma\{\alpha_2/\xi_2\}(\xi)$    (Def. of state variant [3.2.2])
$= (\sigma\{\alpha_2/\xi_2\})\{\alpha_1/\xi_1\}(\xi)$    (Def. of state variant [3.2.2])

Case 3: $\xi \neq \xi_1$ and $\xi \neq \xi_2$
$(\sigma\{\alpha_1/\xi_1\})\{\alpha_2/\xi_2\}(\xi) = \sigma\{\alpha_1/\xi_1\}(\xi)$    (Def. of state variant [3.2.2])
$= \sigma(\xi)$    (Def. of state variant [3.2.2])
$= \sigma\{\alpha_2/\xi_2\}(\xi)$    (Def. of state variant [3.2.2])
$= (\sigma\{\alpha_2/\xi_2\})\{\alpha_1/\xi_1\}(\xi)$    (Def. of state variant [3.2.2])


The  value  of  a variable  in  a state  variant  follows  directly  from  the  definition  of 
state  variants. 


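Lemma 3.6.2 (Semantics of state variants)

If the variable $v$ is strictly non-self-referencing and $v$ is not location-dependent on $x$
then

$\mathcal{R}(v)(\sigma\{\alpha/\mathcal{L}(x)(\sigma)\}) = \begin{cases} \alpha & \text{if } \mathcal{L}(v)(\sigma) = \mathcal{L}(x)(\sigma) \\ \mathcal{R}(v)(\sigma) & \text{otherwise} \end{cases}$

Proof: This must be shown for all $v \in Ivar$.

Case 1: $\mathcal{L}(v)(\sigma) = \mathcal{L}(x)(\sigma)$
$\mathcal{R}(v)(\sigma\{\alpha/\mathcal{L}(x)(\sigma)\})$
$= \sigma\{\alpha/\mathcal{L}(x)(\sigma)\}(\mathcal{L}(v)(\sigma\{\alpha/\mathcal{L}(x)(\sigma)\}))$    (Def. of semantics of expressions [3.5.1])
$= \sigma\{\alpha/\mathcal{L}(x)(\sigma)\}(\mathcal{L}(v)(\sigma))$    (v is non-self-referencing)
$= \sigma\{\alpha/\mathcal{L}(x)(\sigma)\}(x)$    (Given)
$= \sigma\{\alpha/\mathcal{L}(x)(\sigma)\}(\mathcal{L}(x)(\sigma))$    (Def. of location of Svar [3.2.1])
$= \alpha$    (Def. of state variant [3.2.2])

Case 2: $\mathcal{L}(v)(\sigma) \neq \mathcal{L}(x)(\sigma)$
$\mathcal{R}(v)(\sigma\{\alpha/\mathcal{L}(x)(\sigma)\})$
$= \sigma\{\alpha/\mathcal{L}(x)(\sigma)\}(\mathcal{L}(v)(\sigma\{\alpha/\mathcal{L}(x)(\sigma)\}))$    (Def. of semantics of expressions [3.5.1])
$= \sigma\{\alpha/\mathcal{L}(x)(\sigma)\}(\mathcal{L}(v)(\sigma))$    (Given)
$= \sigma(\mathcal{L}(v)(\sigma))$    (Def. of state variant [3.2.2])
$= \mathcal{R}(v)(\sigma)$    (Def. of semantics of expressions [3.5.1])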


Substituting the value of a variable in a state, $\sigma$, for the location of that variable
in $\sigma$ does not change the meaning of $\sigma$.

Lemma 3.6.3

$\sigma\{\mathcal{R}(v)(\sigma)/\mathcal{L}(v)(\sigma)\} = \sigma$

Proof: It must be shown that both states have the same values for all $\xi \in LocV$.

Case 1: $\xi = \mathcal{L}(v)(\sigma)$
$\sigma\{\mathcal{R}(v)(\sigma)/\mathcal{L}(v)(\sigma)\}(\xi) = \mathcal{R}(v)(\sigma)$    (Def. of state variant [3.2.2])
$= \sigma(\mathcal{L}(v)(\sigma))$    (Def. of semantics of expressions [3.5.1])
$= \sigma(\xi)$    (In this case, $\xi = \mathcal{L}(v)(\sigma)$)

Case 2: $\xi \neq \mathcal{L}(v)(\sigma)$
$\sigma\{\mathcal{R}(v)(\sigma)/\mathcal{L}(v)(\sigma)\}(\xi) = \sigma(\xi)$    (Def. of state variant [3.2.2])


A state  variant  may  be  added  or  removed  if  there  is  a subsequent  variant  referring 
to  the  same  location. 


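Lemma 3.6.4 (Introduction and elimination of state variants)

$(\sigma\{\alpha_1/\xi\})\{\alpha_2/\xi\} = \sigma\{\alpha_2/\xi\}$

Proof: It must be shown that both states have the same values for all $\xi_1 \in LocV$.

Case 1: $\xi_1 = \xi$
$(\sigma\{\alpha_1/\xi\})\{\alpha_2/\xi\}(\xi_1) = \alpha_2$    (Def. of state variant [3.2.2])
$= (\sigma\{\alpha_2/\xi\})(\xi_1)$    (Def. of state variant [3.2.2])

Case 2: $\xi_1 \neq \xi$
$(\sigma\{\alpha_1/\xi\})\{\alpha_2/\xi\}(\xi_1) = (\sigma\{\alpha_1/\xi\})(\xi_1)$    (Def. of state variant [3.2.2])
$= \sigma(\xi_1)$    (Def. of state variant [3.2.2])
$= (\sigma\{\alpha_2/\xi\})(\xi_1)$    (Def. of state variant [3.2.2])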


3.7  Semantics  of  Substitution 

Since substitution is used in defining the semantics of some statements, it is
necessary to look at the resulting semantics of substitution. For boolean and integer
expressions, the value of the expression $s$ (or $b$) after syntactic substitution for some
variable $y$ in a state is the same as the value of the expression in the state with the
same semantic substitution made for the variable.


Lemma 3.7.1 (Semantics of substitution into expressions)

$\mathcal{R}(s[s_1/y])(\sigma) = \mathcal{R}(s)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})$

$\mathcal{W}(b[s_1/y])(\sigma) = \mathcal{W}(b)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})$
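
The statement for integer expressions can be checked numerically on one example. The sketch below assumes a dictionary representation of states and a tuple encoding of expressions restricted to constants, simple variables, array references and `+`; the helper names are hypothetical.

```python
def value(e, sigma):
    """Value of an expression: int, variable name, ("idx", a, e) or ("+", e1, e2)."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return sigma[e]
    tag, left, right = e
    if tag == "idx":
        return sigma[(left, value(right, sigma))]
    return value(left, sigma) + value(right, sigma)


def subst(e, s, y):
    """Textual substitution e[s/y]."""
    if isinstance(e, str):
        return s if e == y else e
    if isinstance(e, int):
        return e
    if e[0] == "idx":
        return ("idx", e[1], subst(e[2], s, y))
    return (e[0], subst(e[1], s, y), subst(e[2], s, y))


def variant(sigma, alpha, xi):
    new_state = dict(sigma)
    new_state[xi] = alpha
    return new_state


sigma = {"x": 1, "y": 5, ("a", 2): 10, ("a", 5): 20}
s = ("+", ("idx", "a", "y"), "y")       # a[y] + y
s1 = ("+", "x", 1)                      # x + 1

syntactic = value(subst(s, s1, "y"), sigma)                  # value of s[s1/y] in sigma
semantic = value(s, variant(sigma, value(s1, sigma), "y"))   # value of s in sigma{R(s1)(sigma)/y}
```

Both routes evaluate `a[2] + 2` in the end, one by rewriting the expression and the other by rewriting the state.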

Proof: By pairwise induction on the complexity of $s$ and $b$. Since the operations
are assumed to preserve their meaning in any variant, only the basis steps are shown
here.

Case 1: $s = y$
$\mathcal{R}(y[s_1/y])(\sigma) = \mathcal{R}(s_1)(\sigma)$    (Def. of substitution [3.4.1])
$= \mathcal{R}(y)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})$    (Sem. of state variant [3.6.2])

Case 2: $s = x$, $x \neq y$
$\mathcal{R}(x[s_1/y])(\sigma) = \mathcal{R}(x)(\sigma)$    (Def. of substitution [3.4.1])
$= \mathcal{R}(x)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})$    (Sem. of state variant [3.6.2])

Case 3: $s = m$
$\mathcal{R}(m[s_1/y])(\sigma) = \alpha$    (Def. of constants [3.5.1])
$= \mathcal{R}(m)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})$    (Def. of constants [3.5.1])

Case 4: $b = \mathbf{true}$
$\mathcal{W}(\mathbf{true}[s_1/y])(\sigma) = T$    (Def. of true [3.5.2])
$= \mathcal{W}(\mathbf{true})(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})$    (Def. of true [3.5.2])

Case 5: $b = \mathbf{false}$
$\mathcal{W}(\mathbf{false}[s_1/y])(\sigma) = F$    (Def. of false [3.5.2])
$= \mathcal{W}(\mathbf{false})(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})$    (Def. of false [3.5.2])

□

Substituting into locations is a bit more complicated. In this case, only variables
can be substituted (since it makes no sense to discuss the location of an expression).
Additionally, there are three possible cases. First, the original variable may be an
indexed variable. (Recall that only simple, nonindexed variables can be substituted
for, so no indexed variable will be replaced with another indexed variable.) Second, if
the original variable is simple, there are two cases; the old location can be identical to
it or always separate from it. Additionally, left-hand-side substitution has semantics
very similar to full textual substitution into array variables.


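Lemma 3.7.2 (Semantics of substitution into locations)

$\mathcal{L}(v[v_1/y])(\sigma) = \begin{cases} \mathcal{L}(v_1)(\sigma) & \text{if } v \in Svar,\ v = y \\ \mathcal{L}(v)(\sigma) & \text{if } v \in Svar,\ v \neq y \\ \mathcal{L}(v)(\sigma\{\mathcal{R}(v_1)(\sigma)/\mathcal{L}(y)(\sigma)\}) & \text{if } v \in Avar \end{cases}$

$\mathcal{L}(v\langle v_1/y\rangle)(\sigma) = \mathcal{L}(v)(\sigma\{\mathcal{R}(v_1)(\sigma)/\mathcal{L}(y)(\sigma)\})$

Proof: There are five possible cases (the first three for regular substitution and
the last two for left-hand-side substitution).

Case 1: $v = y$
$\mathcal{L}(y[v_1/y])(\sigma) = \mathcal{L}(v_1)(\sigma)$    (Def. of substitution [3.4.1])

Case 2: $v \in Svar$, $v \neq y$
$\mathcal{L}(v[v_1/y])(\sigma) = \mathcal{L}(v)(\sigma)$    (Def. of substitution [3.4.1])

Case 3: $v \in Avar$, $v = a[s]$
$\mathcal{L}(a[s][v_1/y])(\sigma) = \mathcal{L}(a[s[v_1/y]])(\sigma)$    (Def. of substitution [3.4.1])
$= \langle a, \mathcal{R}(s[v_1/y])(\sigma) \rangle$    (Def. of location of Avar [3.2.1])
$= \langle a, \mathcal{R}(s)(\sigma\{\mathcal{R}(v_1)(\sigma)/\mathcal{L}(y)(\sigma)\}) \rangle$    (Sem. of subst. into expr. [3.7.1])
$= \mathcal{L}(a[s])(\sigma\{\mathcal{R}(v_1)(\sigma)/\mathcal{L}(y)(\sigma)\})$    (Def. of location of Avar [3.2.1])

Case 4: $v \in Svar$
$\mathcal{L}(v\langle v_1/y\rangle)(\sigma) = \mathcal{L}(v)(\sigma)$    (Def. of left-hand-side subst. [3.4.3])
$= \mathcal{L}(v)(\sigma\{\mathcal{R}(v_1)(\sigma)/\mathcal{L}(y)(\sigma)\})$    (Def. of location of Svar [3.2.1])

Case 5: $v \in Avar$, $v = a[s]$
$\mathcal{L}(a[s]\langle v_1/y\rangle)(\sigma) = \mathcal{L}(a[s[v_1/y]])(\sigma)$    (Def. of left-hand-side subst. [3.4.3])
$= \mathcal{L}(a[s])(\sigma\{\mathcal{R}(v_1)(\sigma)/\mathcal{L}(y)(\sigma)\})$    (Same arguments as Case 3, above)

□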


Defining  substitution  into  statements  also  involves  moving  the  substitution  from 
the  statement  into  the  state.  However,  since  the  meaning  of  a statement  results  not 
in  an  integer  or  boolean  value,  but  in  another  state,  simply  evaluating  the  meaning 
of  the  statement  at  a modified  state,  one  with  the  new  value  substituted  for  the  old, 
could  possibly  result  in  an  incorrect  new  state,  one  which  has  the  same  modification. 
For example, the statement (x := y)[3/y] evaluated in $\sigma$ may seem to have the same
results as evaluating x := y in $\sigma\{3/y\}$. Both $\mathcal{M}((x := y)[3/y])(\sigma)$ and
$\mathcal{M}(x := y)(\sigma\{3/y\})$ map $x$ to 3. But $\mathcal{M}((x := y)[3/y])(\sigma)$ maps $y$ to whatever value it has in
$\sigma$, whereas $\mathcal{M}(x := y)(\sigma\{3/y\})$ maps $y$ to 3. Thus, in the cases where the statement
does  not  reset  the  old  value,  it  is  necessary  to  reset  the  old  value  after  evaluating  the 
statement  in  the  state  in  which  the  substitution  has  taken  place.  This  is  described 
in  the  lemma  below. 
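
The example with (x := y)[3/y] can be replayed on a concrete state. The sketch below is deliberately narrow: it models only assignments to the simple variable `x` whose right-hand side is a constant or a simple variable, and the names `variant` and `assign_x` are hypothetical helpers introduced for the illustration.

```python
def variant(sigma, alpha, xi):
    """Return sigma{alpha/xi} as a new dictionary."""
    new_state = dict(sigma)
    new_state[xi] = alpha
    return new_state


def assign_x(rhs, sigma):
    """Meaning of `x := rhs`, where rhs is an int constant or a simple variable name."""
    rhs_value = rhs if isinstance(rhs, int) else sigma[rhs]
    return variant(sigma, rhs_value, "x")


sigma = {"x": 0, "y": 7}

substituted = assign_x(3, sigma)                      # M((x := y)[3/y])(sigma), i.e. x := 3
in_variant = assign_x("y", variant(sigma, 3, "y"))    # M(x := y)(sigma{3/y})
restored = variant(in_variant, sigma["y"], "y")       # reset y to its value in sigma
```

The two routes agree on `x` but not on `y`; resetting `y` afterwards makes the states coincide, which is the correction the following lemma builds in.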


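Lemma 3.7.3 (Semantics of substitution into statements)

$\mathcal{M}(S[s_1/y])(\sigma) = \begin{cases} (\mathcal{M}(S)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\}))\{\mathcal{R}(y)(\sigma)/\mathcal{L}(y)(\sigma)\} & \text{if } \mathcal{R}(y)(\sigma) = \mathcal{R}(y)(\mathcal{M}(S)(\sigma)) \\ \mathcal{M}(S)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\}) & \text{otherwise} \end{cases}$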


Proof:  By  induction  on  the  complexity  of  S.  Only  the  most  interesting  case, 
when  S is  an  assignment  statement,  is  presented. 
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Case  1:  H(y)(<r)  ± 7£(j/)(A4(5)(<7)). 

(Since  S is  an  assignment  statement,  it  must  be  of  the  form  y :=  s.) 

M((y  :=  s)[si/y])(<x) 

= M(y  <Si/y>:=  s[s1/y])((r)  (Def.  of  substitution  [3.4.4]) 

= M(y  :=  >s[si/y])(cr)  (Def.  of  l.h.s.  subst.  [3.4.3]) 

= v{K{s[si.l y])(cr) / C(y)(cr)}  (Def.  of  v :=  s [3.5.3]) 

= cT{n(S)(a{n(Sl)(a)/C(y)(a)})/C(y)(a)} 

(Sem.  of  substitution  [3.7.1]) 

= (a{K(S!)(ff)/£(!/)(^)}){K(S)(<T{K(61)(<r)/z;(!,)(<T)})/£(!,)(<T)} 

(Introduction  of  a variant  [3.6.4]) 

{K(S)(<T{R(Sl)(<T)/£(j,)(,T)})/£(I,)(^{K(3l)(<r)/£(!,)(ff)})} 

(Def.  of  location  of  Svar  [3.2.1]) 
= M(y  :=  s)(a{7J(s1)(<r)/£(y)(a)})  (Def.  of  v :=  s [3.5.3]) 


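Case 2: $\mathcal{R}(y)(\sigma) = \mathcal{R}(y)(\mathcal{M}(S)(\sigma))$. (Thus S is of the form v := s and $y \neq v$.)

\begin{align*}
&\mathcal{M}((v := s)[s_1/y])(\sigma) \\
&= \mathcal{M}(v\langle s_1/y\rangle := s[s_1/y])(\sigma) && \text{(Def. of substitution [3.4.4])} \\
&= \sigma\{\mathcal{R}(s[s_1/y])(\sigma)/\mathcal{L}(v\langle s_1/y\rangle)(\sigma)\} && \text{(Def. of } v := s \text{ [3.5.3])} \\
&= \sigma\{\mathcal{R}(s)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})/\mathcal{L}(v\langle s_1/y\rangle)(\sigma)\} && \text{(Sem. of substitution [3.7.1])} \\
&= \sigma\{\mathcal{R}(s)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})/\mathcal{L}(v)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})\} && \text{(Sem. of substitution [3.7.2])} \\
&= (\sigma\{\mathcal{R}(s)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})/\mathcal{L}(v)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})\}) \\
&\qquad \{\mathcal{R}(y)(\sigma\{\mathcal{R}(s)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})/\mathcal{L}(v)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})\})/ \\
&\qquad\ \mathcal{L}(y)(\sigma\{\mathcal{R}(s)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})/\mathcal{L}(v)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})\})\} && \text{(Sem. of state variants [3.6.2])} \\
&= (\sigma\{\mathcal{R}(s)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})/\mathcal{L}(v)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})\}) \\
&\qquad \{\mathcal{R}(y)(\sigma)/\mathcal{L}(y)(\sigma\{\mathcal{R}(s)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})/\mathcal{L}(v)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})\})\} && \text{(Sem. of state variants [3.6.2])} \\
&= (\sigma\{\mathcal{R}(s)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})/\mathcal{L}(v)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})\})\{\mathcal{R}(y)(\sigma)/\mathcal{L}(y)(\sigma)\} && \text{(Def. of location of Svar [3.2.1])} \\
&= ((\sigma\{\mathcal{R}(s)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})/\mathcal{L}(v)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})\}) \\
&\qquad \{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})\{\mathcal{R}(y)(\sigma)/\mathcal{L}(y)(\sigma)\} && \text{(Introduction of a variant [3.6.4])} \\
&= ((\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\}) \\
&\qquad \{\mathcal{R}(s)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})/\mathcal{L}(v)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\})\}) \\
&\qquad \{\mathcal{R}(y)(\sigma)/\mathcal{L}(y)(\sigma)\} && \text{(Interchange of state variants [3.6.1])} \\
&= (\mathcal{M}(v := s)(\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(y)(\sigma)\}))\{\mathcal{R}(y)(\sigma)/\mathcal{L}(y)(\sigma)\} && \text{(Def. of } v := s \text{ [3.5.3])}
\end{align*}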
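
Both branches of Lemma 3.7.3 can be checked on small examples. The following is a minimal sketch, not part of the original development: it assumes states are Python dicts over simple variables only (so the location of a variable is its name), and the helper names `variant`, `assign` and `lemma_rhs` are hypothetical. The substituted statements are written out by hand.

```python
def variant(state, value, var):
    """Return the state variant sigma{value/var} without mutating sigma."""
    new_state = dict(state)
    new_state[var] = value
    return new_state

def assign(var, expr):
    """Meaning of `var := expr`, where expr maps a state to an integer."""
    return lambda state: variant(state, expr(state), var)

def lemma_rhs(meaning, y, s1, state):
    """Right-hand side of Lemma 3.7.3 for the substitution [s1/y]."""
    modified = meaning(variant(state, s1(state), y))
    if meaning(state)[y] == state[y]:
        # S does not change y: undo the temporary variant on y.
        return variant(modified, state[y], y)
    # S assigns y itself: there is no old value to restore.
    return modified

five = lambda state: 5
sigma = {"y": 2, "z": 0}

# S is z := y + 1, so S[5/y] is z := 5 + 1.
s_a = assign("z", lambda st: st["y"] + 1)
s_a_subst = assign("z", lambda st: 5 + 1)

# S is y := y + 1, so S[5/y] is y := 5 + 1 (the left-hand side is kept).
s_b = assign("y", lambda st: st["y"] + 1)
s_b_subst = assign("y", lambda st: 5 + 1)

print(s_a_subst(sigma) == lemma_rhs(s_a, "y", five, sigma))
print(s_b_subst(sigma) == lemma_rhs(s_b, "y", five, sigma))
```

The first statement exercises the branch in which the old value of y is restored; the second exercises the "otherwise" branch.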

Although substitution into assignment statements uses a special left-hand-side
substitution,  in  some  cases  (most  particularly,  copy  propagation),  full  textual  variable 
substitution,  rather  than  left-hand-side  substitution,  occurs  in  the  left-hand-side  of 
assignment  statements.  This  strong  substitution  will  result  in  one  of  two  cases, 
depending  on  whether  or  not  the  original  variable  is  identical  to  the  old  variable. 


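Lemma 3.7.4 (Semantics of strong substitution into statements)

If $v_2$ is location-independent of $v_1$ then

\[
\mathcal{M}(v_1[v_2/x] := s[v_2/x])(\sigma) =
\begin{cases}
\mathcal{M}(v_2 := s)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\}) & \text{if } v_1 \in Svar \text{ and } v_1 = x \\
(\mathcal{M}(v_1 := s)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\}))\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\} & \text{otherwise}
\end{cases}
\]

Proof: There are three possible cases for $\mathcal{L}(v_1)$ and $\mathcal{L}(x)$.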


Case 1: $v_1 \in Svar$, $v_1 = x$.

M{yi[v2lx)  :=  s[u2/x])(cr) 

= °{K(s[v2/ x])\(t)  / C{yi[v2/ x])((t)}  (Def.  of  v :=  s [3.5.3]) 

= <7{^(5)(a{7e(t;2)(a)/£(x)(a)})/£(Ul[U2/a:])(<7)} 

(Sem.  of  substitution  [3.7.1]) 
= a{-R(s){<r{K{vf)(o )/£(x)(<t)})/£(u2)(<t)} 

(Sem.  of  substitution  [3.7.2]) 

= <T{7l(i)(<r{^(«2)(<T)/£(*)((7)})/£(w2)(<T{^(«2)(<T)/£(«1)((T)})} 

(Given) 

(Given) 

= M(v2  :=  s)(cr{7l(v2)(*)/£(x)(cr)})  (Def.  of  v :=  s [3.5.3]) 


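Case 2: $v_1 \in Svar$, $v_1 \neq x$.

\begin{align*}
&\mathcal{M}(v_1[v_2/x] := s[v_2/x])(\sigma) \\
&= \mathcal{M}(v_1 := s[v_2/x])(\sigma) && \text{(Def. of substitution [3.4.1])} \\
&= \sigma\{\mathcal{R}(s[v_2/x])(\sigma)/\mathcal{L}(v_1)(\sigma)\} && \text{(Def. of } v := s \text{ [3.5.3])} \\
&= \sigma\{\mathcal{R}(s)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})/\mathcal{L}(v_1)(\sigma)\} && \text{(Sem. of substitution [3.7.1])} \\
&= (\sigma\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\})\{\mathcal{R}(s)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})/\mathcal{L}(v_1)(\sigma)\} && \text{(Introduction of a variant [3.6.4])} \\
&= ((\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\}) \\
&\qquad \{\mathcal{R}(s)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})/\mathcal{L}(v_1)(\sigma)\} && \text{(Sem. of state variants [3.6.3])} \\
&= ((\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})\{\mathcal{R}(s)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})/\mathcal{L}(v_1)(\sigma)\}) \\
&\qquad \{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\} && \text{(Interchange of state variants [3.6.1])} \\
&= ((\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\}) \\
&\qquad \{\mathcal{R}(s)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})/\mathcal{L}(v_1)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})\}) \\
&\qquad \{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\} && \text{(Given)} \\
&= (\mathcal{M}(v_1 := s)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\}))\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\} && \text{(Def. of } v := s \text{ [3.5.3])}
\end{align*}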


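Case 3: $v_1 \in Avar$.

\begin{align*}
&\mathcal{M}(v_1[v_2/x] := s[v_2/x])(\sigma) \\
&= \sigma\{\mathcal{R}(s[v_2/x])(\sigma)/\mathcal{L}(v_1[v_2/x])(\sigma)\} && \text{(Def. of } v := s \text{ [3.5.3])} \\
&= \sigma\{\mathcal{R}(s)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})/\mathcal{L}(v_1[v_2/x])(\sigma)\} && \text{(Sem. of substitution [3.7.1])} \\
&= \sigma\{\mathcal{R}(s)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})/\mathcal{L}(v_1)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})\} && \text{(Sem. of substitution [3.7.2])} \\
&= (\sigma\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\}) \\
&\qquad \{\mathcal{R}(s)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})/\mathcal{L}(v_1)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})\} && \text{(Sem. of state variants [3.6.3])} \\
&= ((\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\}) \\
&\qquad \{\mathcal{R}(s)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})/\mathcal{L}(v_1)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})\} && \text{(Introduction of a variant [3.6.4])} \\
&= ((\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\}) \\
&\qquad \{\mathcal{R}(s)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})/\mathcal{L}(v_1)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\})\}) \\
&\qquad \{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\} && \text{(Interchange of state variants [3.6.1])} \\
&= (\mathcal{M}(v_1 := s)(\sigma\{\mathcal{R}(v_2)(\sigma)/\mathcal{L}(x)(\sigma)\}))\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\} && \text{(Def. of } v := s \text{ [3.5.3])}
\end{align*}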


3.8  Sets  and  Uses 


In  examining  statements  for  possible  transformations,  it  is  often  necessary  to 
see  which  variables  they  manipulate.  Statements  can  manipulate  variables  in  two 
different  ways — they  may  simply  reference  the  variable’s  value  without  changing  it, 
or  they  may  actually  give  a variable  a new  value.  The  first  will  be  referred  to  as  a 
use of the variable and the second a set of the variable. A definition of using and
setting  variables  is  given  below: 

Definition 3.8.1 (Setting and using)

Let $\Phi \in \Sigma \to \Sigma$ and $x \in Ivar$.

a) $\Phi$ sets x whenever, for some $\sigma$, $\Phi(\sigma)(x) \neq \sigma(x)$.

b) $\Phi$ uses x whenever, for some $\sigma$ and $\alpha$, $\Phi(\sigma\{\alpha/\mathcal{L}(x)(\sigma)\}) \neq \Phi(\sigma)\{\alpha/\mathcal{L}(x)(\sigma)\}$.

The set of all variables set and used by a given function will be written sets($\Phi$) and uses($\Phi$) respectively.
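
Definition 3.8.1 can be explored by brute force over a small state space. The sketch below is illustrative only and rests on several assumptions: states are dicts over the two simple variables x and y, the integers are replaced by a small finite range, and the function names are hypothetical. A finite search can miss differences that only show up outside the chosen range, so it approximates the definition rather than deciding it.

```python
from itertools import product

VARS = ("x", "y")
VALUES = range(-2, 3)  # small finite stand-in for the integers (assumption)

def variant(state, value, var):
    """Return the state variant sigma{value/var}."""
    new_state = dict(state)
    new_state[var] = value
    return new_state

def all_states():
    for values in product(VALUES, repeat=len(VARS)):
        yield dict(zip(VARS, values))

def sets(phi):
    """Variables whose value phi changes in at least one state."""
    return {v for v in VARS
            if any(phi(s)[v] != s[v] for s in all_states())}

def uses(phi):
    """Variables for which applying a variant before phi differs from
    applying the same variant after phi, in at least one state."""
    return {v for v in VARS
            if any(phi(variant(s, a, v)) != variant(phi(s), a, v)
                   for s in all_states() for a in VALUES)}

def copy_x_to_y(state):        # y := x
    return variant(state, state["x"], "y")

def y_gets_x_minus_x(state):   # y := x - x
    return variant(state, state["x"] - state["x"], "y")

def x_gets_x(state):           # x := x
    return variant(state, state["x"], "x")

for phi in (copy_x_to_y, y_gets_x_minus_x, x_gets_x):
    print(phi.__name__, sorted(sets(phi)), sorted(uses(phi)))
```

Under this definition an assigned variable also counts as used, because overwriting it before the statement has no lasting effect while overwriting it afterwards does.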

While expressions do not set variables’ values, they certainly do use the values of variables, so the definition of uses is extended to the functions $\mathcal{R}$ and $\mathcal{W}$ in the following definition.

Definition 3.8.2 (Using)

$\mathcal{R}(s)$ uses x whenever, for some $\sigma$ and $\alpha$, $\mathcal{R}(s)(\sigma\{\alpha/\mathcal{L}(x)(\sigma)\}) \neq \mathcal{R}(s)(\sigma)$. The set of all variables used by $\mathcal{R}(s)$ will be written uses(s).

$\mathcal{W}(b)$ uses x whenever, for some $\sigma$ and $\alpha$, $\mathcal{W}(b)(\sigma\{\alpha/\mathcal{L}(x)(\sigma)\}) \neq \mathcal{W}(b)(\sigma)$. The set of all variables used by $\mathcal{W}(b)$ will be written uses(b).

If an assignment statement v := s is not nullable, then uses(v) and uses(s) are subsets of uses(v := s).

The  integer  and  boolean  operations  in  this  language  have  the  same  meanings  in 
all  states,  so  if  a variant  does  not  change  the  locations  of  the  variables  or  values  at 
those  locations  used  by  an  expression,  it  will  not  change  the  value  of  the  expression. 


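Lemma 3.8.1 (Equality under state variants)

If $x \notin uses(s)$ then $\mathcal{R}(s)(\sigma\{\alpha/x\}) = \mathcal{R}(s)(\sigma)$.
If $x \notin uses(b)$ then $\mathcal{W}(b)(\sigma\{\alpha/x\}) = \mathcal{W}(b)(\sigma)$.

Proof: By induction on the complexity of s and b. Since the operations are assumed to preserve their meaning in any variant, only the basis steps are shown here.

Case 1: $s = y$, $y \neq x$
\begin{align*}
\mathcal{R}(y)(\sigma\{\alpha/x\}) &= \mathcal{R}(y)(\sigma) && \text{(Sem. of state variant [3.6.2])}
\end{align*}

Case 2: $s = m$
\begin{align*}
\mathcal{R}(m)(\sigma\{\alpha/x\}) &= \alpha_i && \text{(Def. of constants [3.5.1], where } m = \alpha_i\text{)} \\
&= \mathcal{R}(m)(\sigma) && \text{(Def. of constants [3.5.1])}
\end{align*}

Case 3: $b = \mathbf{true}$
\begin{align*}
\mathcal{W}(\mathbf{true})(\sigma\{\alpha/x\}) &= T && \text{(Def. of true [3.5.2])} \\
&= \mathcal{W}(\mathbf{true})(\sigma) && \text{(Def. of true [3.5.2])}
\end{align*}

Case 4: $b = \mathbf{false}$
\begin{align*}
\mathcal{W}(\mathbf{false})(\sigma\{\alpha/x\}) &= F && \text{(Def. of false [3.5.2])} \\
&= \mathcal{W}(\mathbf{false})(\sigma) && \text{(Def. of false [3.5.2])}
\end{align*}

□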


When  actually  performing  transformations  on  code,  sets  and  uses  (or  more  likely, 
their  static  equivalents,  discussed  in  Section  3.9)  will  be  computed  to  determine 
whether  a particular  transformation  is  valid.  Knowing  that  the  intersection  of  the 
set of all variables set by one statement with the set of all variables used by
another statement is empty provides additional information about the statements, as
described  in  the  lemmas  below.  If  the  sets  of  an  arbitrary  statement  and  the  uses 
of  an  assignment  statement  have  an  empty  intersection,  then  the  value  and  location 
of  the  left-hand  side  of  the  assignment  and  the  value  of  the  right-hand  side  of  the 
assignment  will  not  change  as  a result  of  executing  the  statement. 

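Lemma 3.8.2 (Empty intersections of sets and uses)

If sets($\mathcal{M}(S)$) $\cap$ uses($\mathcal{M}(v := s)$) $= \emptyset$ and v := s is not nullable, then

• $\mathcal{L}(v)(\sigma) = \mathcal{L}(v)(\mathcal{M}(S)(\sigma))$

• $\mathcal{R}(v)(\sigma) = \mathcal{R}(v)(\mathcal{M}(S)(\sigma))$

• $\mathcal{R}(s)(\sigma) = \mathcal{R}(s)(\mathcal{M}(S)(\sigma))$

For the proof of this, assume $\mathcal{M}(S)(\sigma) = \sigma\{\alpha_1/\xi_1\} \ldots \{\alpha_n/\xi_n\}$ where $\sigma(\xi_i) \neq \alpha_i$ (so that the $\alpha_i$ are only the values set by $\mathcal{M}(S)$).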


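Proof of the first part:

Case 1: $v = x \in Svar$
\begin{align*}
\mathcal{L}(x)(\sigma) &= \mathcal{L}(x)(\mathcal{M}(S)(\sigma)) && \text{(Def. of location of Svar [3.2.1])}
\end{align*}

Case 2: $v = a[s] \in Avar$
\begin{align*}
\mathcal{L}(a[s])(\sigma) &= \langle a, \mathcal{R}(s)(\sigma) \rangle && \text{(Def. of location of Avar [3.2.1])} \\
&= \langle a, \mathcal{R}(s)(\sigma\{\alpha_1/\xi_1\}) \rangle && \text{(Def. of sets and uses [3.8.1])} \\
&= \langle a, \mathcal{R}(s)(\sigma\{\alpha_1/\xi_1\} \ldots \{\alpha_n/\xi_n\}) \rangle && \text{(Repeated def. of sets and uses [3.8.1])} \\
&= \langle a, \mathcal{R}(s)(\mathcal{M}(S)(\sigma)) \rangle && \text{(Given)} \\
&= \mathcal{L}(a[s])(\mathcal{M}(S)(\sigma)) && \text{(Def. of location of Avar [3.2.1])}
\end{align*}

Proof of the second part:
\begin{align*}
\mathcal{R}(v)(\sigma) &= \mathcal{R}(v)(\sigma\{\alpha_1/\xi_1\}) && \text{(Def. of sets and uses [3.8.1])} \\
&= \mathcal{R}(v)(\sigma\{\alpha_1/\xi_1\} \ldots \{\alpha_n/\xi_n\}) && \text{(Repeated def. of sets and uses [3.8.1])} \\
&= \mathcal{R}(v)(\mathcal{M}(S)(\sigma)) && \text{(Given)}
\end{align*}

The proof of the third part is almost identical to this.

□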


Notice that the previous lemma requires that the statements in question not be
nullable. If they were, this result might not apply. Consider the statement x := x.
Since both uses and sets of $\mathcal{M}(x := x)$ are empty, the intersection of them with
uses or sets of any other statement is also empty. Therefore, the intersection of
uses($\mathcal{M}(x := x)$) and sets($\mathcal{M}(x := 7)$) is empty, but it is certainly not the case that
$\mathcal{R}(x)(\sigma) = \mathcal{R}(x)(\mathcal{M}(x := 7)(\sigma))$. Counter-examples to the other two results are also
readily available.

This  result  can  be  applied  to  the  intersection  of  the  set  sets  and  the  set  uses  of 
an  arbitrary  expression. 


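Lemma 3.8.3 (Empty intersections of sets and uses)
If sets($\mathcal{M}(S)$) $\cap$ uses(s) $= \emptyset$ then $\mathcal{R}(s)(\sigma) = \mathcal{R}(s)(\mathcal{M}(S)(\sigma))$.
If sets($\mathcal{M}(S)$) $\cap$ uses(b) $= \emptyset$ then $\mathcal{W}(b)(\sigma) = \mathcal{W}(b)(\mathcal{M}(S)(\sigma))$.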




The  proof  of  this  lemma  is  similar  to  that  of  Lemma  3.8.2. 

If  the  sets  of  an  assignment  statement  and  the  uses  of  an  arbitrary  statement  have 
an  empty  intersection,  the  meaning  of  the  arbitrary  statement  is  the  same,  whether 
or  not  the  assignment  statement  has  been  executed.  As  with  all  cases  in  the  definition 
of  the  meaning  of  a statement  in  a modified  state,  the  modification  must  be  undone 
after  evaluating  the  statement. 

Lemma 3.8.4 (Empty intersections of sets and uses)

If sets($\mathcal{M}(v := s)$) $\cap$ uses($\mathcal{M}(S)$) $= \emptyset$ and v := s is not nullable, then

\[
\mathcal{M}(S)(\sigma) = (\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(v)(\sigma)\}))\{\mathcal{R}(v)(\sigma)/\mathcal{L}(v)(\sigma)\}
\]

This proof follows by mathematical induction on the complexity of S. The basis
case when S is D follows directly from the meaning of D (in Definition 3.5.3). When
S is an assignment statement, the basis case comes from Lemma 3.8.2.

In the most general case, if the intersection of the sets of an arbitrary statement,
$S_1$, and the uses of another arbitrary statement, $S_2$, is empty, then the locations
assigned by $S_2$ and the values assigned to those locations are the same in the state $\sigma$
and $\mathcal{M}(S_1)(\sigma)$. This is clearly not reciprocal. If the intersection of sets($S_1$) and
uses($S_2$) is empty, then it is possible that the locations or the values assigned by $S_1$ may
be different in $\sigma$ and $\mathcal{M}(S_2)(\sigma)$. Consider if $S_1$ were x := y and $S_2$ were y := 15.
Here, sets($S_1$) is [x] and uses($S_2$) is [y]. The effective meaning of $S_2$ in any state,
adding a variant with 15 in place of y, is the same regardless of whether $S_1$ has been
executed. However, if $S_2$ is not executed before $S_1$, $S_1$ yields a variant where $\sigma(y)$ is
assigned to x, whereas, if $S_2$ is executed first, 15 is assigned to x. This is expressed
formally in the following lemma.

Lemma 3.8.5 (Empty intersections of sets and uses)

If sets($\mathcal{M}(S_1)$) $\cap$ uses($\mathcal{M}(S_2)$) $= \emptyset$ then given $\sigma$ where $\mathcal{M}(S_1)(\sigma) = \sigma\{\alpha_{11}/\xi_{11}\} \ldots \{\alpha_{1m}/\xi_{1m}\}$ and $\mathcal{M}(S_2)(\sigma) = \sigma\{\alpha_{21}/\xi_{21}\} \ldots \{\alpha_{2n}/\xi_{2n}\}$ and $\sigma(\xi_{ij}) \neq \alpha_{ij}$, then

• $\forall i \forall j\,(\xi_{1i} \neq \xi_{2j})$

• $\mathcal{M}(S_2)(\mathcal{M}(S_1)(\sigma)) = (\sigma\{\alpha_{21}/\xi_{21}\} \ldots \{\alpha_{2n}/\xi_{2n}\})\{\alpha_{11}/\xi_{11}\} \ldots \{\alpha_{1m}/\xi_{1m}\}$

Proving that the $\xi_{ij}$ are not equal uses a simple contradiction from the definitions
of sets, uses, and the meaning of a statement. The proof of the second part follows by
mathematical induction on the complexity of $S_2$. The basis case when $S_2$ is D follows
directly from the meaning of D (in Definition 3.5.3). When $S_2$ is an assignment
statement, the basis case comes from Lemma 3.8.2.

Notice that since the $\xi_{1i}$ and the $\xi_{2j}$ are not equal, Lemma 3.6.1 can be used to
interchange the state variants, giving, for example

\[
\mathcal{M}(S_2)(\mathcal{M}(S_1)(\sigma)) = (\sigma\{\alpha_{11}/\xi_{11}\} \ldots \{\alpha_{1m}/\xi_{1m}\})\{\alpha_{21}/\xi_{21}\} \ldots \{\alpha_{2n}/\xi_{2n}\}.
\]

3.9  Static  Approximation  of  Sets  and  Uses 

The  definitions  of  sets  and  uses  only  consider  dynamic  cases  where  the  state’s  value 
is  known  in  advance.  To  actually  perform  transformations,  however,  it  is  necessary 
to  evaluate  the  sets  and  uses  of  a statement  or  expression  statically  with  little,  if  any, 
information  about  the  state  in  which  the  statement  or  expression  is  being  evaluated. 
The function ivar is used as a static approximation of uses.

Definition 3.9.1 (ivar)

The set of all integer variables occurring in s, b, or S is denoted by ivar(s), ivar(b)
or ivar(S), respectively, and consists of the following:

$ivar(x) = \{x\}$,

$ivar(a[s]) = \{a\} \cup ivar(s)$,

$ivar(m) = \emptyset$,

$ivar(s_1 \mathbin{@} s_2) = ivar(s_1) \cup ivar(s_2)$,

$ivar(\mathbf{if}\ b\ \mathbf{then}\ s_1\ \mathbf{else}\ s_2\ \mathbf{fi}) = ivar(b) \cup ivar(s_1) \cup ivar(s_2)$,

$ivar(\mathbf{true}) = \emptyset$,

$ivar(\mathbf{false}) = \emptyset$,

$ivar(s_1 \mathbin{\%} s_2) = ivar(s_1) \cup ivar(s_2)$,

$ivar(\neg b) = ivar(b)$,

$ivar(v := s) = ivar(v) \cup ivar(s)$,

$ivar(S_1; S_2) = ivar(S_1) \cup ivar(S_2)$,

$ivar(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi}) = ivar(b) \cup ivar(S_1) \cup ivar(S_2)$,

$ivar(D) = \emptyset$, and

$ivar(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od}) = ivar(s_1) \cup ivar(s_2) \cup ivar(S)$.

Lemma 3.9.1

If $x \notin ivar(S)$ then $x \notin$ uses($\mathcal{M}(S)$).

This proof follows by induction on the complexity of S.

The function livar, defined only for statements, describes the set of variables on
the left-hand side of assignments and can be used statically in place of sets.

Definition 3.9.2 (livar)

The set of all integer variables on the left-hand side of an assignment in S is denoted
by livar(S) and consists of the following:

$livar(x := s) = \{x\}$,

$livar(a[s] := s) = \{a\}$,

$livar(S_1; S_2) = livar(S_1) \cup livar(S_2)$,

$livar(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi}) = livar(S_1) \cup livar(S_2)$,

$livar(D) = \emptyset$, and

$livar(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od}) = livar(S)$.

Lemma 3.9.2

If $x \notin livar(S)$ then $x \notin$ sets($\mathcal{M}(S)$).

This proof follows by induction on the complexity of S.

While these definitions provide a close approximation of the simple variables in
sets and uses, they tend to be overly broad with array references, adding the entire
array to the set, instead of only the element being accessed. For example,
ivar(x := 3 * w + z) is [x, w, z], but ivar(a[x] := 3 * w + z) is [a, x, w, z]. Since array
references are so important and prevalent in image processing, a finer approximation
of sets and uses is employed in the actual implementation of these transformations.
This approximation is discussed in detail in Section 6.3.
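
The recursive definitions of ivar and livar translate directly into a traversal of a syntax tree. The following is a minimal sketch under stated assumptions: the node classes (`Var`, `Index`, `Const`, `BinOp`, `Assign`, `Seq`) are hypothetical, and only a fragment of the language is covered (no conditionals, loops, or boolean expressions).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:            # simple variable x
    name: str

@dataclass(frozen=True)
class Index:          # array reference a[s]
    array: str
    index: object

@dataclass(frozen=True)
class Const:          # integer constant m
    value: int

@dataclass(frozen=True)
class BinOp:          # binary integer operation on two expressions
    op: str
    left: object
    right: object

@dataclass(frozen=True)
class Assign:         # v := s
    target: object
    expr: object

@dataclass(frozen=True)
class Seq:            # S1; S2
    first: object
    second: object

def ivar(node):
    """All integer variables occurring in an expression or statement."""
    if isinstance(node, Var):
        return {node.name}
    if isinstance(node, Index):
        return {node.array} | ivar(node.index)
    if isinstance(node, Const):
        return set()
    if isinstance(node, BinOp):
        return ivar(node.left) | ivar(node.right)
    if isinstance(node, Assign):
        return ivar(node.target) | ivar(node.expr)
    if isinstance(node, Seq):
        return ivar(node.first) | ivar(node.second)
    raise TypeError(f"unsupported node: {node!r}")

def livar(stmt):
    """Variables appearing on the left-hand side of assignments."""
    if isinstance(stmt, Assign):
        target = stmt.target
        return {target.name} if isinstance(target, Var) else {target.array}
    if isinstance(stmt, Seq):
        return livar(stmt.first) | livar(stmt.second)
    raise TypeError(f"unsupported statement: {stmt!r}")

rhs = BinOp("+", BinOp("*", Const(3), Var("w")), Var("z"))   # 3 * w + z
simple = Assign(Var("x"), rhs)                               # x := 3 * w + z
indexed = Assign(Index("a", Var("x")), rhs)                  # a[x] := 3 * w + z

print(sorted(ivar(simple)), sorted(ivar(indexed)))
print(sorted(livar(simple)), sorted(livar(indexed)))
```

The indexed assignment shows the coarseness discussed above: the whole array a enters both sets, not just the element a[x].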

For simple variables, the sets uses($\mathcal{M}(S)$) and ivar(S) (as well as the sets livar(S)
and sets($\mathcal{M}(S)$)) may appear to be equivalent, and very often do refer to the same
set. They are not, however, always identical. As a counter-example, consider the
statement y := x − x. Here, the set ivar(S) is [x, y], those variables which appear
in the statement. But the set uses($\mathcal{M}(S)$) is [y]. It does not include x because
$\mathcal{M}(S)(\sigma\{\alpha/x\}) = (\mathcal{M}(S)(\sigma))\{\alpha/x\}$ for all $\sigma \in \Sigma$ and all $\alpha \in V$. Using ivar as an
approximation of uses (and livar as an approximation of sets) will often cause no
problem, but it may prevent some transformations from taking place. For instance,
if ivar(S) is employed instead of uses($\mathcal{M}(S)$) in the cases for statement interchange
(in Theorem 4.1.1), the statements x := 7; y := x − x are not recognized as interchangeable.
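
The gap between the static and the semantic test can be seen on this pair of statements. The sketch below is illustrative: it assumes states are dicts over the simple variables x and y, uses a small finite range in place of the integers, and writes the static sets out by hand from Definitions 3.9.1 and 3.9.2; the function names are hypothetical.

```python
from itertools import product

VALUES = range(-2, 3)  # small finite stand-in for the integers (assumption)

def variant(state, value, var):
    new_state = dict(state)
    new_state[var] = value
    return new_state

def x_gets_7(state):            # x := 7
    return variant(state, 7, "x")

def y_gets_x_minus_x(state):    # y := x - x
    return variant(state, state["x"] - state["x"], "y")

def interchangeable(first, second):
    """Compare first;second with second;first on every sampled state."""
    for x, y in product(VALUES, repeat=2):
        state = {"x": x, "y": y}
        if second(first(state)) != first(second(state)):
            return False
    return True

# Static sets, written out by hand from the definitions of ivar and livar.
livar_s1 = {"x"}          # livar(x := 7)
ivar_s2 = {"x", "y"}      # ivar(y := x - x)

static_test_passes = not (livar_s1 & ivar_s2)
semantic_test_passes = interchangeable(x_gets_7, y_gets_x_minus_x)
print(static_test_passes, semantic_test_passes)
```

The static test rejects the pair because x appears in both sets, while the semantic comparison finds no state in the sample on which the two orders differ.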

The  sets  ivar  and  livar  usually  give  a reasonable  approximation  of  always  separate. 

Lemma 3.9.3 (Static approximation of Always Separate)

If ivar($S_1$) and livar($S_2$) are disjoint, the elements of uses($\mathcal{M}(S_1)$) and sets($\mathcal{M}(S_2)$)
are always separate.

Proof:

Let $ivar(S_1) \cap livar(S_2) = \emptyset$ and suppose $v \in$ uses($\mathcal{M}(S_1)$) and $v \in$ sets($\mathcal{M}(S_2)$).

By the contrapositive of Lemma 3.9.1, since $v \in$ uses($\mathcal{M}(S_1)$), $v \in ivar(S_1)$.

By the contrapositive of Lemma 3.9.2, since $v \in$ sets($\mathcal{M}(S_2)$), $v \in livar(S_2)$.

Therefore, $v \in ivar(S_1) \cap livar(S_2)$.

By contradiction, there can be no v such that $v \in$ uses($\mathcal{M}(S_1)$) and $v \in$ sets($\mathcal{M}(S_2)$),
so they are always separate.

□ 


CHAPTER  4 

PRIMITIVE  TRANSFORMATIONS 


This chapter focuses on the minimal transformations that can be performed on
programs in the language defined in Chapter 3. These transformations will be
combined in Chapter 5 to give global optimizations. Since they are less complex than the
global optimizations, primitive transformations are more easily proved correct. The
proofs of the correctness of each of the minimal transformations are provided here as
well. The basic code transformations are:

• statement  interchange  (Theorem  4.1.1). 

There  are  two  variations  on  interchange: 

- interchange  with  substitution  (Theorem  4.1.4) 

- interchange  with  backward  substitution  (Theorem  4.1.5) 

• statement  compression  (Theorem  4.1.6) 

• movement  of  statements  into  (and  out  of)  if  statements  (Theorem  4.2.1)  and 
a variation  that  uses  substitution  (Theorem  4.2.2) 

• if  then  else  statement  splitting  (Theorem  4.2.3) 

• if  then  else  statement  simplification  (Theorem  4.2.5) 

• loop  rolling  and  unrolling  (Theorems  4.3.1  and  4.3.2) 

To move the statement $S_k$ in front of the statement $S_1$,
it must be interchanged with all of the statements $S_1, \ldots, S_{k-1}$:

[Figure: the statement list $S_1; \ldots; S_{k-1}; S_k$ shown after each successive interchange of $S_k$ with its predecessor.]

Figure 4.1. Statement Interchange Used to Move a Statement


4.1  Statement  Transformations 

The  most  basic  of  the  smaller  transformations  is  statement  interchange.  Two 
adjacent  statements  may,  in  some  situations,  be  interchanged  with  no  ultimate  change 
in  meaning  to  the  entire  program.  In  some  computer  architectures  this  may  result  in 
a shorter  running  time,  but  that  is  not  the  goal  of  statement  interchange.  Instead, 
it  will  support  rearranging  code  so  that  other,  more  powerful  optimizations  may 
be  performed.  Statement  interchange  is  often  used  to  move  one  statement  to  the 
beginning  (or  end)  of  a group  of  other  statements,  as  shown  in  figure  4.1.  In  code 
motion,  a standard  loop  optimization  technique  that  removes  invariant  statements 
from  loops,  statement  interchanges  first  move  the  statement  being  removed  from  the 
loop  to  the  beginning  of  that  loop.  Additionally,  many  of  the  other  transformations 
discussed  in  this  section  require  that  statements  be  interchangeable  in  order  for  the 
transformation  to  take  place. 
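
Moving a statement to the front of a group by repeated adjacent interchanges can be sketched as follows. This is an illustration only: the `Stmt` record and the functions `can_interchange` and `move_to_front` are hypothetical names, and the test used is the conservative static one (livar of each statement disjoint from ivar of the other), not the full set of conditions developed later in this chapter.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stmt:
    text: str
    livar: frozenset   # variables on the left-hand side
    ivar: frozenset    # all variables occurring in the statement

def can_interchange(a, b):
    """Conservative static test: neither statement assigns a variable
    that occurs anywhere in the other."""
    return not (a.livar & b.ivar) and not (b.livar & a.ivar)

def move_to_front(stmts, k):
    """Bubble stmts[k] to position 0 by adjacent interchanges.
    Returns the reordered list, or None if some interchange is refused."""
    result = list(stmts)
    pos = k
    while pos > 0:
        if not can_interchange(result[pos - 1], result[pos]):
            return None
        result[pos - 1], result[pos] = result[pos], result[pos - 1]
        pos -= 1
    return result

program = [
    Stmt("a := b + 1", frozenset({"a"}), frozenset({"a", "b"})),
    Stmt("c := a * 2", frozenset({"c"}), frozenset({"c", "a"})),
    Stmt("d := b - 1", frozenset({"d"}), frozenset({"d", "b"})),
]

moved = move_to_front(program, 2)
print([s.text for s in moved])
print(move_to_front(program, 1))
```

The third statement can be moved to the front because it shares no assigned variable with the others; the second cannot, because it reads a, which the first statement assigns.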


Definition 4.1.1 (Interchangeability)

Two statements $S_1$ and $S_2$ are said to be interchangeable in state $\sigma$ if $\mathcal{M}(S_1; S_2)(\sigma) = \mathcal{M}(S_2; S_1)(\sigma)$.

While there will always be pathological cases in which two seemingly conflicting
statements will still be interchangeable, a set of sufficient conditions for statement
interchange is given in Theorem 4.1.1.

Theorem 4.1.1 (Conditions for statement interchange)

Two statements $S_1$ and $S_2$ are interchangeable if any of the following conditions is
true:

a) $S_1 = S_2$

b) $\forall y \forall x$ such that $y \in$ sets($\mathcal{M}(S_1)$) and $x \in$ uses($\mathcal{M}(S_2)$), sep(y, x);

$\forall y \forall x$ such that $y \in$ sets($\mathcal{M}(S_2)$) and $x \in$ uses($\mathcal{M}(S_1)$), sep(y, x);

c) $S_1 = v := f(v)$, $S_2 = v := g(v)$, v is strictly non-self-referencing, and
$f(g(v)) = g(f(v))$.

Proof: 

The  proof  of  part  a)  follows  directly  from  the  definition  of  equal  statements  [3.5.4]. 
The  proof  of  part  b)  is  in  Theorem  4.1.2. 

The  proof  of  part  c)  is  in  Theorem  4.1.3. 

Clearly, if two statements are identical, their order does not matter. It is when
the statements are different that statement interchange becomes interesting. First,
two statements can be interchanged if the locations set by each are not used by the
other. For example, a[x] := 3 * y can be interchanged with z := w − y, because a,
which is set by the first, is not used in computing w − y; and z, which is set by the
second, is not used in computing 3 * y.
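
The example can be checked by running both orders on a sample of states. The sketch below makes several assumptions: states are dicts in which an array element a[i] is stored under the key ("a", i), the sampled value ranges are small, and the third statement, y := w, is a hypothetical contrast that is not taken from the text.

```python
from itertools import product

def variant(state, value, location):
    new_state = dict(state)
    new_state[location] = value
    return new_state

def store_3y(state):            # a[x] := 3 * y
    return variant(state, 3 * state["y"], ("a", state["x"]))

def z_gets_w_minus_y(state):    # z := w - y
    return variant(state, state["w"] - state["y"], "z")

def y_gets_w(state):            # y := w  (hypothetical contrast)
    return variant(state, state["w"], "y")

def interchangeable(first, second):
    """Compare first;second with second;first on every sampled state."""
    small = range(-1, 2)
    for x, y, w, z in product(range(2), small, small, small):
        state = {"x": x, "y": y, "w": w, "z": z, ("a", 0): 0, ("a", 1): 0}
        if second(first(state)) != first(second(state)):
            return False
    return True

print(interchangeable(store_3y, z_gets_w_minus_y))
print(interchangeable(store_3y, y_gets_w))
```

The second pair fails because y, which the contrast statement sets, is used in computing 3 * y.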
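
Theorem 4.1.2 (Statement interchange)

$\models S_1; S_2 = S_2; S_1$

Provided:

sets($\mathcal{M}(S_1)$) $\cap$ uses($\mathcal{M}(S_2)$) $= \emptyset$
sets($\mathcal{M}(S_2)$) $\cap$ uses($\mathcal{M}(S_1)$) $= \emptyset$

Proof:

Assume that $\mathcal{M}(S_1)(\sigma) = \sigma\{\alpha_{11}/\xi_{11}\} \ldots \{\alpha_{1m}/\xi_{1m}\}$ and
that $\mathcal{M}(S_2)(\sigma) = \sigma\{\alpha_{21}/\xi_{21}\} \ldots \{\alpha_{2n}/\xi_{2n}\}$ where $\sigma(\xi_{xy}) \neq \alpha_{xy}$.

\begin{align*}
&\mathcal{M}(S_1; S_2)(\sigma) \\
&= \mathcal{M}(S_2)(\mathcal{M}(S_1)(\sigma)) && \text{(Def. of } S_1; S_2 \text{ [3.5.3])} \\
&= (\sigma\{\alpha_{11}/\xi_{11}\} \ldots \{\alpha_{1m}/\xi_{1m}\})\{\alpha_{21}/\xi_{21}\} \ldots \{\alpha_{2n}/\xi_{2n}\} && \text{(Empty sets and uses [3.8.5])} \\
&= (\sigma\{\alpha_{21}/\xi_{21}\} \ldots \{\alpha_{2n}/\xi_{2n}\})\{\alpha_{11}/\xi_{11}\} \ldots \{\alpha_{1m}/\xi_{1m}\} && \text{(Interchange of state variants [3.6.1])} \\
&= \mathcal{M}(S_1)(\mathcal{M}(S_2)(\sigma)) && \text{(Empty sets and uses [3.8.5])} \\
&= \mathcal{M}(S_2; S_1)(\sigma) && \text{(Def. of } S_1; S_2 \text{ [3.5.3])}
\end{align*}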


It is not necessary that two statements have empty intersections of their sets and
uses sets in order to interchange them. If the statements are assignments to the
same variable and the expressions being assigned to the variable can be composed
in either order (i.e. if f(g(v)) = g(f(v))), the statements may also be interchanged.

For example, the statements x := 3 * x and x := 4 * x may be interchanged because,
for integer expressions, 4 * (3 * x) = 3 * (4 * x). These statements would not be
interchangeable under the conditions of Theorem 4.1.2 because the uses and the sets
of each statement are identical, the set containing just x. The statements x := 3 * x
and x := 4 + x are obviously not interchangeable, and would not be as a result of
this theorem, because, in general, 4 + (3 * x) ≠ 3 * (4 + x).
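
The two examples can be checked numerically. This is a minimal sketch that assumes a state holding only the simple variable x and a small sample of integer values; the names `assign_x` and `interchangeable` are hypothetical.

```python
def assign_x(f):
    """Meaning of x := f(x) on a state holding only x."""
    return lambda state: {**state, "x": f(state["x"])}

triple = assign_x(lambda v: 3 * v)      # x := 3 * x
quadruple = assign_x(lambda v: 4 * v)   # x := 4 * x
add_four = assign_x(lambda v: 4 + v)    # x := 4 + x

def interchangeable(first, second, values=range(-5, 6)):
    """Compare first;second with second;first for each sampled x."""
    return all(second(first({"x": v})) == first(second({"x": v}))
               for v in values)

print(interchangeable(triple, quadruple))
print(interchangeable(triple, add_four))
```

The multiplications commute, so the first pair agrees on every sampled state; the second pair differs everywhere, since 4 + 3x and 12 + 3x are never equal.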
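
Theorem 4.1.3 (Interchange of assignments of functions)

$\models v := f(v); v := g(v) = v := g(v); v := f(v)$

Provided:

v is strictly non-self-referencing and $f(g(v)) = g(f(v))$

Proof: This can be proved by manipulating the meaning of v := f(v); v := g(v).

\begin{align*}
&\mathcal{M}(v := f(v); v := g(v))(\sigma) \\
&= \mathcal{M}(v := g(v))(\mathcal{M}(v := f(v))(\sigma)) && \text{(Def. of } S_1; S_2 \text{ [3.5.3])} \\
&= \mathcal{M}(v := g(v))(\sigma\{\mathcal{R}(f(v))(\sigma)/\mathcal{L}(v)(\sigma)\}) && \text{(Def. of } v := s \text{ [3.5.3])} \\
&= (\sigma\{\mathcal{R}(f(v))(\sigma)/\mathcal{L}(v)(\sigma)\}) \\
&\qquad \{\mathcal{R}(g(v))(\sigma\{\mathcal{R}(f(v))(\sigma)/\mathcal{L}(v)(\sigma)\})/\mathcal{L}(v)(\sigma\{\mathcal{R}(f(v))(\sigma)/\mathcal{L}(v)(\sigma)\})\} && \text{(Def. of } v := s \text{ [3.5.3])} \\
&= (\sigma\{\mathcal{R}(f(v))(\sigma)/\mathcal{L}(v)(\sigma)\}) \\
&\qquad \{\mathcal{R}(g(v)[f(v)/v])(\sigma)/\mathcal{L}(v)(\sigma\{\mathcal{R}(f(v))(\sigma)/\mathcal{L}(v)(\sigma)\})\} && \text{(Sem. of subst. into expr. [Lemma 3.7.1])} \\
&= (\sigma\{\mathcal{R}(f(v))(\sigma)/\mathcal{L}(v)(\sigma)\})\{\mathcal{R}(g(v)[f(v)/v])(\sigma)/\mathcal{L}(v)(\sigma)\} && \text{(Given)} \\
&= \sigma\{\mathcal{R}(g(v)[f(v)/v])(\sigma)/\mathcal{L}(v)(\sigma)\} && \text{(Elimination of a variant [3.6.4])} \\
&= \sigma\{\mathcal{R}(g(f(v)))(\sigma)/\mathcal{L}(v)(\sigma)\} && \text{(Def. of substitution [3.4.1])} \\
&= \sigma\{\mathcal{R}(f(g(v)))(\sigma)/\mathcal{L}(v)(\sigma)\} && \text{(Given)} \\
&= \sigma\{\mathcal{R}(f(v)[g(v)/v])(\sigma)/\mathcal{L}(v)(\sigma)\} && \text{(Def. of substitution [3.4.1])} \\
&= (\sigma\{\mathcal{R}(g(v))(\sigma)/\mathcal{L}(v)(\sigma)\})\{\mathcal{R}(f(v)[g(v)/v])(\sigma)/\mathcal{L}(v)(\sigma)\} && \text{(Introduction of a variant [3.6.4])} \\
&= (\sigma\{\mathcal{R}(g(v))(\sigma)/\mathcal{L}(v)(\sigma)\})\{\mathcal{R}(f(v))(\sigma\{\mathcal{R}(g(v))(\sigma)/\mathcal{L}(v)(\sigma)\})/\mathcal{L}(v)(\sigma)\} && \text{(Sem. of subst. into expr. [Lemma 3.7.1])} \\
&= (\sigma\{\mathcal{R}(g(v))(\sigma)/\mathcal{L}(v)(\sigma)\}) \\
&\qquad \{\mathcal{R}(f(v))(\sigma\{\mathcal{R}(g(v))(\sigma)/\mathcal{L}(v)(\sigma)\})/\mathcal{L}(v)(\sigma\{\mathcal{R}(g(v))(\sigma)/\mathcal{L}(v)(\sigma)\})\} && \text{(Given)} \\
&= \mathcal{M}(v := f(v))(\sigma\{\mathcal{R}(g(v))(\sigma)/\mathcal{L}(v)(\sigma)\}) && \text{(Def. of } v := s \text{ [3.5.3])} \\
&= \mathcal{M}(v := f(v))(\mathcal{M}(v := g(v))(\sigma)) && \text{(Def. of } v := s \text{ [3.5.3])} \\
&= \mathcal{M}(v := g(v); v := f(v))(\sigma) && \text{(Def. of } S_1; S_2 \text{ [3.5.3])}
\end{align*}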


Sometimes  the  conditions  necessary  for  interchange  do  not  exist.  If  the  first 
of  the  pair  of  statements  is  an  assignment  statement,  it  may  still  be  possible  and 
desirable  to  rearrange  the  code  by  interchanging  the  statements  while  making  the 
same substitution that would have been made by the assignment statement. This
interchange  with  substitution  is  discussed  in  the  following  theorem. 

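Theorem 4.1.4 (Interchange with substitution)

$\models x := s; S = S[s/x]; x := s$

Provided:

uses($\mathcal{M}(x := s)$) $\cap$ sets($\mathcal{M}(S)$) $= \emptyset$


Proof: This can be proved by manipulation of the meaning of x := s; S. If
x := s is nullable, the result is obviously true (x := x; S = S[x/x]; x := x), so the
proof assumes x := s is not nullable.

As a preliminary result, notice that

\[
\mathcal{R}(s)((\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}))\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\}) = \mathcal{R}(s)((\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\}).
\]

This is not the result which would be expected as a result of Lemma 3.8.2, because rather than
evaluating s in the state with $\mathcal{M}(S)$ applied to the entire variant, $\mathcal{M}(S)$ is applied
to just the state $\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}$ and the variant $\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\}$ is then
applied to the resulting state.

To see this, first let $\mathcal{M}(S)(\sigma) = \sigma\{\alpha_1/\xi_1\} \ldots \{\alpha_n/\xi_n\}$. Then it follows:

\begin{align*}
&\mathcal{R}(s)((\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}))\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\}) \\
&= \mathcal{R}(s)(((\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})\{\alpha_1/\xi_1\} \ldots \{\alpha_n/\xi_n\})\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\}) && \text{(From above)} \\
&= \mathcal{R}(s)((\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\}) && \text{(Repeated application of Lemma 3.8.1)}
\end{align*}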


The  proof  of  interchange  with  substitution  then  follows. 

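\begin{align*}
&\mathcal{M}(x := s; S)(\sigma) \\
&= \mathcal{M}(S)(\mathcal{M}(x := s)(\sigma)) && \text{(Def. of } S_1; S_2 \text{ [3.5.3])} \\
&= \mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}) && \text{(Def. of } v := s \text{ [3.5.3])} \\
&= (\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})) \\
&\qquad \{\mathcal{R}(x)(\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})) \\
&\qquad\ /\mathcal{L}(x)(\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}))\} && \text{(Sem. of state variants [3.6.3])} \\
&= (\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})) \\
&\qquad \{\mathcal{R}(x)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})/\mathcal{L}(x)(\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}))\} && \text{(Def. of sets and uses [3.8.1])} \\
&= (\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})) \\
&\qquad \{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}))\} && \text{(Sem. of state variants [3.6.2])} \\
&= (\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}))\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\} && \text{(Def. of location of Svar [3.2.1])} \\
&= (\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}))\{\mathcal{R}(s)(\sigma\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\})/\mathcal{L}(x)(\sigma)\} && \text{(Sem. of state variants [3.6.3])} \\
&= (\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})) \\
&\qquad \{\mathcal{R}(s)((\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\})/\mathcal{L}(x)(\sigma)\} && \text{(Introduction of a variant [3.6.4])} \\
&= (\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})) \\
&\qquad \{\mathcal{R}(s)((\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}))\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\})/\mathcal{L}(x)(\sigma)\} && \text{(Preliminary result)} \\
&= ((\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}))\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\}) \\
&\qquad \{\mathcal{R}(s)((\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}))\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\})/\mathcal{L}(x)(\sigma)\} && \text{(Introduction of a variant [3.6.4])} \\
&= ((\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}))\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\}) \\
&\qquad \{\mathcal{R}(s)((\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}))\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\}) \\
&\qquad\ /\mathcal{L}(x)((\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}))\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\})\} && \text{(Def. of location of Svar [3.2.1])} \\
&= \mathcal{M}(x := s)((\mathcal{M}(S)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\}))\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\}) && \text{(Def. of } v := s \text{ [3.5.3])} \\
&= \mathcal{M}(x := s)(\mathcal{M}(S[s/x])(\sigma)) && \text{(Subst. into statements [3.7.3])} \\
&= \mathcal{M}(S[s/x]; x := s)(\sigma) && \text{(Def. of } S_1; S_2 \text{ [3.5.3])}
\end{align*}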


Notice  that  statement  interchange  with  substitution  provides  the  same  result  as 
copy  propagation  when  the  statement  being  interchanged  is  an  assignment  of  the 
form  x :=  v. 
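
The copy-propagation reading of Theorem 4.1.4 can be checked on a concrete pair. The following sketch assumes states are dicts over the simple variables x, y and v, samples a small range of values, and writes the substituted statement S[v/x] out by hand; all function names are hypothetical.

```python
from itertools import product

def variant(state, value, var):
    new_state = dict(state)
    new_state[var] = value
    return new_state

def seq(*stmts):
    """Meaning of S1; S2; ...: run the statements left to right."""
    def run(state):
        for stmt in stmts:
            state = stmt(state)
        return state
    return run

def x_gets_v(state):           # x := v
    return variant(state, state["v"], "x")

def y_gets_x_plus_1(state):    # S:      y := x + 1
    return variant(state, state["x"] + 1, "y")

def y_gets_v_plus_1(state):    # S[v/x]: y := v + 1
    return variant(state, state["v"] + 1, "y")

def same_meaning(p, q):
    for x, y, v in product(range(-2, 3), repeat=3):
        state = {"x": x, "y": y, "v": v}
        if p(state) != q(state):
            return False
    return True

original = seq(x_gets_v, y_gets_x_plus_1)           # x := v; y := x + 1
with_subst = seq(y_gets_v_plus_1, x_gets_v)         # y := v + 1; x := v
without_subst = seq(y_gets_x_plus_1, x_gets_v)      # y := x + 1; x := v

print(same_meaning(original, with_subst))
print(same_meaning(original, without_subst))
```

Plain interchange changes the meaning here, while interchange with substitution preserves it, which is the situation the theorem addresses.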


Yet another version of interchange with substitution, one in which the second statement is
the assignment statement being used for the substitution, can be useful, particularly
in backward copy statement propagation. Interchanging statements while propagating a
copy statement backwards, rather than forwards, is discussed in the theorem below.


Theorem 4.1.5 (Interchange with backward substitution)

$\models v_1 := s; v_2 := x = v_2 := x; v_1[v_2/x] := s[v_2/x]; x := v_2$

Provided:

1) uses($\mathcal{M}(v_1 := s)$) $\cap$ sets($\mathcal{M}(v_2 := x)$) $= \emptyset$

2) $v_2$ is location-independent of $v_1$.

3) $v_2$ is strictly non-self-referencing.
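
A small instance of Theorem 4.1.5 can be checked in the same way. The sketch below takes $v_1$ to be z, $v_2$ to be y, and s to be x + 1, so the pair z := x + 1; y := x becomes y := x; z := y + 1; x := y. It assumes states are dicts over simple variables and samples a small range of values; the function names are hypothetical and the substituted statement is written out by hand.

```python
from itertools import product

def variant(state, value, var):
    new_state = dict(state)
    new_state[var] = value
    return new_state

def seq(*stmts):
    """Meaning of S1; S2; ...: run the statements left to right."""
    def run(state):
        for stmt in stmts:
            state = stmt(state)
        return state
    return run

def z_gets_x_plus_1(state):    # v1 := s
    return variant(state, state["x"] + 1, "z")

def y_gets_x(state):           # v2 := x
    return variant(state, state["x"], "y")

def z_gets_y_plus_1(state):    # v1[v2/x] := s[v2/x]
    return variant(state, state["y"] + 1, "z")

def x_gets_y(state):           # x := v2
    return variant(state, state["y"], "x")

def same_meaning(p, q):
    for x, y, z in product(range(-2, 3), repeat=3):
        state = {"x": x, "y": y, "z": z}
        if p(state) != q(state):
            return False
    return True

original = seq(z_gets_x_plus_1, y_gets_x)
transformed = seq(y_gets_x, z_gets_y_plus_1, x_gets_y)
print(same_meaning(original, transformed))
```

In this instance the trailing x := y restores a value that x already holds; it becomes significant in the case $v_1 = x$, where the substituted assignment writes to $v_2$ instead of x.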

Proof: This can be proved by manipulation of the meaning of $v_1 := s; v_2 := x$.
There will be two separate cases, either $v_1 = x$ or it is not. Again, the cases in which
either $v_1 := s$ or $v_2 := x$ are nullable follow directly and are not considered in the
proof.

Case 1: $v_1 = x$
M.[y\  :=  s;u2  :=  x)(<r) 

= M(v2  :=  x)(M(Vl  :=  s)(a))  (Def.  of  Sx;  52  [3.5.3]) 

= M(v2  :=  x)(<7 {'R(s)(a)/ (Def.  of  v :=  s [3.5.3]) 

= (<r{7e(S)(Cr)/r(^1)(<T)}) 

{^(x)((r{^)(a)/£(t;1)(^)})//:(t;2)((7{7i(5)(<7)/£(t;1)(<T)})} 

(Def.  of  v :=  s [3.5.3]) 

= (<r{R(s)(<r)/£(x)(<r)}) 

{W(»I)(<T{K(«)(<T)/£(v1)(ff)})/£(u2)(ff{K(J)(ff)/£(Vl)(<r)})} 

(Given) 

= (<T{^(5)(a)/£(x)(<T)}){^(s)(£r)/£(u2)((T{^(s)(<x)/£(u1)(<r)})} 

(Sem.  of  state  variants  [3.6.2]) 

= (ffW*)W/£(*)(ff)}){«(«)(<r)/£(»j)(ff)} 

(Given) 

= (<'M*)(<7)/£(t,2)(<r)}){K(s)(<7)/£(x)(<7)} 

(Interchange  of  state  variants  [3.6.1]) 
= (^{^(5)(<7)/£(v2)(a)}){^(va)((T{^(a)(a)/£(i;2)(<T)})/£(x)(<T)} 

(Sem.  of  state  variants  [3.6.2]) 

= (<r{rc(S)(<T)/£(v2)(<r)}) 

W®2)(ff{K(»)(<r)/£(»j)(iT)})/£(i)(<r{X(»)(<r)/£(«J)(ff)})} 

(Def.  of  location  of  Svar  [3.2.1]) 

= M(x  :=  v2)(<r{ft(s)(<r)/£(v2)(<7)})  (Def.  of  v :=  s [3.5.3]) 

= M(x  :=  »j)((<r{W(i)(ff)/£(i)2)(<T)}){R(a)(<T)/£(»,)(a)}) 

(Introduction  of  a variant  [3.6.4]) 

= M(x  :=  V2) 

((a{TJ(x)(<7)/£(v2)((T)}){^(3)(<7{^(x)(<7)/£(i;2)(<7)})/£(t;2)(<T)}) 

(Given) 

= ■■=  u2)((,7{K(x)(<7)/£(t)j)(<r)}) 

W*)(»{W(*)W/£(»j)(»)})/£(»a)(<T{R(x)(<x)/£(»,)(<x)})}) 

(Given) 

= M(x  :=  v2)(M(v2  :=  5)(ct{^(x)(<t)/£(i;2)(<t)})) 

(Def.  of  v :=  3 [3.5.3]) 
$= \mathcal{M}(x := v_2)(\mathcal{M}(v_2 := s)((\sigma\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\})\{\mathcal{R}(x)(\sigma)/\mathcal{L}(v_2)(\sigma)\}))$

(Sem. of state variants [3.6.3])

= M(x  :=  v2)(M(v2  :=  s)((a{1l(x)(<7)/ C(v2)(a)}){1l(x)((T)/ C(x)(<t)}) 

(Interchange  of  state  variants  [3.6.1]) 

= M(x  :=  v2)(M(v2  :=  s ) 

«»{K(*)W/£(»a)W}){1t(»2)(<r{K(*)(<r)//:(»,)(<r)})/£(x)W}) 

(Sem.  of  state  variants  [3.6.2]) 

= M(x  :=  v2)(M(vi  ■■=  s)((<r{'R(i)((T)/£(t,2)((r)}) 

{V.(v2)(<7{K(x)(v)IC(v2)(a)})IC(x)(a{Tl(x)(c)IC(v2)(<,)})}) 

(Def.  of  location  of  Svar  [3.2.1]) 

= M(x  :=  v2)(M{yx[v2l x\  :=  5[t;2/ar])(cr{7^(x)(<T)/£(u2)(cr)})) 

(Strong  subst.  into  statements  [3.7.4]) 
= M(vx[v2/x ] :=  5[u2/x];x  :=  v2)(c {R(x)(cr) / C(v 2)(cr)})) 

(Def.  of  S\ ; S2  [3.5.3]) 

= M(v i[v2/x]  :=  s[v2/x]\x  :=  v2)(M{v2  :=  x)(<r)) 

(Def.  of  v s [3.5.3]) 

= M(v2  :=  x-,vx[v2/x ] :=  s[u2/x];x  :=  v2)(a) 

(Def.  of  Sx-S2  [3.5.3]) 


Case 2: $v_1 \neq x$

As a preliminary result, notice that $\sigma\{\mathcal{R}(x)(\sigma)/\mathcal{L}(v)(\sigma)\}\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\} = \sigma\{\mathcal{R}(x)(\sigma)/\mathcal{L}(v)(\sigma)\}$. This can be shown by looking at the two possible values of
$\mathcal{L}(v)(\sigma)$. If $\mathcal{L}(v)(\sigma) = \mathcal{L}(x)(\sigma)$, this is just a case of a redundant variant and follows
from Lemma 3.6.4. If $\mathcal{L}(v)(\sigma) \neq \mathcal{L}(x)(\sigma)$, this follows from interchange of the variants
and elimination of the $\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\}$ variant, by Lemma 3.6.3.

A second preliminary result that is used in the following proof is that $\mathcal{R}(x)(\sigma) = \mathcal{R}(x)(\sigma\{\mathcal{R}(x)(\sigma)/\mathcal{L}(v)(\sigma)\})$. If x = v, this follows by Lemma 3.6.3. If x ≠ v, this is
a direct result of Lemma 3.6.2.

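$\mathcal{M}(v_1 := s;\ v_2 := x)(\sigma)$

$= \mathcal{M}(v_2 := x)(\mathcal{M}(v_1 := s)(\sigma))$ (Def. of $S_1; S_2$ [3.5.3])

$= \mathcal{M}(v_2 := x)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(v_1)(\sigma)\})$ (Def. of $v := s$ [3.5.3])

$= (\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(v_1)(\sigma)\})\{\mathcal{R}(x)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(v_1)(\sigma)\})/\mathcal{L}(v_2)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(v_1)(\sigma)\})\}$

(Def. of $v := s$ [3.5.3])

$= (\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(v_1)(\sigma)\})\{\mathcal{R}(x)(\sigma)/\mathcal{L}(v_2)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(v_1)(\sigma)\})\}$

(Sem. of state variants [3.6.2])

$= (\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(v_1)(\sigma)\})\{\mathcal{R}(x)(\sigma)/\mathcal{L}(v_2)(\sigma)\}$

(Given)

$= (\sigma\{\mathcal{R}(x)(\sigma)/\mathcal{L}(v_2)(\sigma)\})\{\mathcal{R}(s)(\sigma)/\mathcal{L}(v_1)(\sigma)\}$

(Interchange of state variants [3.6.1])

$= ((\sigma\{\mathcal{R}(x)(\sigma)/\mathcal{L}(v_2)(\sigma)\})\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\})\{\mathcal{R}(s)(\sigma)/\mathcal{L}(v_1)(\sigma)\}$

(First preliminary result)

$= ((\sigma\{\mathcal{R}(x)(\sigma)/\mathcal{L}(v_2)(\sigma)\})\{\mathcal{R}(s)(\sigma)/\mathcal{L}(v_1)(\sigma)\})\{\mathcal{R}(x)(\sigma)/\mathcal{L}(x)(\sigma)\}$

(Interchange of state variants [3.6.1])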

= (((^W*)(<7)/£(u2)(<7)}) 

{^(x)(<7{7e(x)(<7)/£(l;2)((7)})/£(x)(<r{7l(x)(<7)/£(t;2)(<T)})}) 

{7^(5)(cr)/£(Vl)(Cr)}){^(x)(cx)/£(x)(<x)} 

(Sem.  of  state  variants  [3.6.3]) 

= (((^(R(^)(^)/£(t>2)(<T)}){K(x)(<T)/£(i)(^{7J(x)(<r)/£(«2)(^)})}) 
{rcM(<T)/£(*1)(<7)}){TC(x)(,x)/£(*)(<7)} 

(Second  preliminary  result) 

= (((<r{^(x)(cr)/£(l;2)(<T)}){7e(5)(<7)/£(u1)(cr)}) 

{7?.(x)(<t)/£(x)((t{7?.(x)((t)/£(u2)((t)  })} ) {7?.(x  )(<t)/  £(x)((t)} 

(Interchange  of  state  variants  [3.6.1]) 

= (((<r{^(x)(<r)/£(t,2)(0-)}){^(5)(<r)/£(l;1)(<T)}) 

{^(x)(<7{7e(x)(cr)/£(t;2)((7)})/£(x)(<7{^(x)((T)/£(u2)(<7)})}) 

{1?.(x)(<7)/£(x)((7)}  (Second  preliminary  result) 

= (((<r{^(x)(a)/£(t;2)(a)}){7e(s)(<7)/£(i;1)(<7)}) 

{^(x)(<7{Ti(x)(<x)/£(i;2)(<7)})/£(x)(«T{^(x)(flr)/£(t;2)(<7)})}) 

{7e(u2)(<T{^(x)(<7)/£(u2)(<T)})/£(x)(<r)} 

(Sem.  of  state  variants  [3.6.2]) 

= (((<T{^(x)(a)/£(«3)(<r)}){^(a)(a)/£(w1)(a)}) 

{7e(x)(<T{7l(x)(<T)/£(u2)(a)})/£(x)(<T{^(x)(<7)/£(u2)((x)})}) 

{^(«2)((^{^(x)(<r)/£(t;2)(<T)}){7e(5)(<T)/£(v1)((7)})/£(x)(a)} 

(Sem.  of  state  variants  [3.6.2]) 

= (((<x{TC(x)(<7)/£(r2)(ff)}){U(s)(^)/£(t,1)(<T)}) 

{7J(x)(<T{7J(i)(<,)/£(x)(Cr{K(x)(<T)/£(^)(<r)})})/£(„2)(<,)}) 

{K(x2)(((<r{'re(x)(<T)/£(t,2)(<7)}){R(i)(<x)/£(x1)(<x)}) 

{^(*)((‘'{K(x)(ff)/£(oj)(ff)}){R(a)(,7)/£(«l)(<T)}) 

/£(*)((»W*)(ff)/£(»a)(<r)})W»)(<r)/£(»1)(<T)})})/£(x)(<r)} 

(Sem.  of  state  variants  [3.6.3]) 

= (((^{R(x)(<x)/£(x2)(<x)}){R(s)(<x)/£(x1)(a)}) 

{TC(x)(<x{R(x)(<T)/£(x)(a{R(x)(Cr)/£(t,2)(,T)})})/£(t,!)(<r))) 

{R(x2)(((<x{U(x)(<7)/£(t.2)(<T)}){7lW(^)/£(„1)(<x)}) 

{K(x)(a{K(x)(^)/£(l,2)(<T)}) 

/£(x)((a{K(x)(<T)/£(t,2)(a)})(K(s)(a)/£(^)(»)})})/£(x)(<7)} 

(Sem.  of  state  variants  [3.6.2]) 

= (((<T{7J(x)(a)/£(x2)M}){K(3)(<T)/£(x1)M}) 

{K(x)(<2{R(x)(<T)/£(x)(a(K(x)(<7)/£(t,2)(^)})})/£(x2)(ff)}) 

{TC(x2)(((<7{K(x)(<t)/£(v2)(<t)}){K(s)(»)/£(vi)(^)}) 

{7?(x)(a{TC(x)(x)/£(x2)(ff)}) 

/£(x)(x{TC(x)(x)/£(t»2)(CT)})})/£(x)(<T)} 

(Def.  of  location  of  Svar  [3.2.1]) 
= (((<*W*)(<0/£(°2)(<')})W»)M/£(u1)(<0}) 

{TC(x)(<T{TC(i)(<T)/£(I)(tr{K(x)(^)/£(„2)(a)})})/£(»2)(^)}) 

{K(I)2)(((^{K(x)(tr)/£(«2)(<r)}){R(s)(«7)/£(x1)(^)}) 

{K(x)(<T{K(x)(<T)/£(o2)(<r)})/£(x)(<T{R(x)(cr)/£(i>2)(,T)})}) 

/£(x)(((^{K(x)(<r)/£(»2)(a)}){K(s)(a)/£(»1)(^)}) 

{K(x)(ff{K(x)((r)/£(x2)(^)})/£(x)(<r{'R(x)(<T)/£(ti2)(o')})})} 

(Def.  of  location  of  Svar  [3.2.1]) 

= M(x  :=  «j)(((<7{R(x)(<r)/£(»2)(ff)}){W(4)(<x)/£(t,1)(<T)}) 
{R(x)(<r{TC(x)(<7)/£(x2)(<7)})/£(x)(^{R(x)(a)/£(x2)(<T)})}) 

(Def.  of  v :=  s [3.5.3]) 

= M{x  :=  v2){{(a{n(x)(a)l C(v2){a)}){n(s)(M(v2  :=  x)(«r))/£(n1)((T)}) 
{7e(x)(<7{7e(x)(<r)/£(U2)(CT)})/£(x)((7{7e(x)((T)/£(n2)(<x)})}) 

(Def.  of  empty  sets  and  uses  [3.8.2]) 

= M(*  :=  V2)(((<7{W(*)(a)/£(U2)(a)}){^(i)(<T{^(x)((r)/£(V2)(a)})/£(t;1)(<7)}) 
{7e(x)(<T{^(x)(<7)/£(n2)(c7)})/£(x)(a{^(x)(CT)/£(t;2)(cT)})}) 

(Def.  of  v :=  s [3.5.3]) 

= M(x  :=  v2)(((a{7l(x)(a)/C(v2)(a)}) 

{7e(5)(<7{^(x)(cr)/£(t,2)((T)})/£(n1)(Ai(n2  :=  x)(<r))}) 
{7e(x)(<T{7e(x)(<r)/£(t;2)(a)})/£(x)(cT{^(x)(<7)/£(n2)(<7)})}) 

(Def.  of  empty  sets  and  uses  [3.8.2]) 

= M(x  '■=  / C(v2)(cr)}) 

{^(i)(a{^(x)(<r)/£(W2)(a)})/£(Wl)(<T{^(x)(iT)/£(i,2)(a)})}) 

{^(x)(a{TJ(x)(<7)/£(i;2)(<7)})/£(x)(<T{7J(x)(<7)/£(i;2)(<r)})}) 

(Def.  of  u :=  s [3.5.3]) 

= M(x  :=  v2){(M(v,  :=  a)(a{R(x)(^)/£(ti2)(ff)}) 

{R(x)(«T{R(x)(CT)/£(»2)(^)})/£(x)(<7{R(x)(a)/£(t,2)(ff)})}) 

(Def.  of  v :=  5 [3.5.3]) 

= Mix  ~ v,)(((M( i>,  :=  <)(ff{R(x)(<r)/£(»a)(<r)}){R(x)(<x)/£(x)(<r)}) 
{R(x)(^{R(x)(ff)/£(v2)(^)})/£(x)(^{R(x)(<x)/£(v2)(<T)})}) 

(First  preliminary  result) 

= M(x  :=  u2)(((X(ui  :=  s)(<t{TI{x){<t)I  C{y2){a)}) 

{K(v2)(<t{TI(x)((t)/ C(v2)(<t)})/ C(x)(cr)}) 

{^(x)(cr{^(x)(tr)/£(u2)(a)})/£(x)((r{7e(x)(a)/£(t;2)(<T)})}) 

(Sem.  of  state  variants  [3.6.2]) 

= M(x  :=  v2)(((M(v,  :=  i)(<7(R(x)(<x)/£(i;2)(<r)}) 

{R(v2)(<x{R(x)(<x)/£(x2)(CT)})/£(x)HR(x)(<x)/£(t,2)(<7)})}) 

{R(x)(<x{R(x)(a)/£(v2)(^)})/£(x)(ff{R(x)(a)/£(t,2)(<r)))}) 

(Def.  of  location  of  Svar  [3.2.1]) 

$= \mathcal{M}(x := v_2)(\mathcal{M}(v_1[v_2/x] := s[v_2/x])(\sigma\{\mathcal{R}(x)(\sigma)/\mathcal{L}(v_2)(\sigma)\}))$

(Strong subst. into statements [3.7.4])

$= \mathcal{M}(v_1[v_2/x] := s[v_2/x];\ x := v_2)(\sigma\{\mathcal{R}(x)(\sigma)/\mathcal{L}(v_2)(\sigma)\})$

(Def. of $S_1; S_2$ [3.5.3])

$= \mathcal{M}(v_1[v_2/x] := s[v_2/x];\ x := v_2)(\mathcal{M}(v_2 := x)(\sigma))$

(Def. of $v := s$ [3.5.3])

$= \mathcal{M}(v_2 := x;\ v_1[v_2/x] := s[v_2/x];\ x := v_2)(\sigma)$

(Def. of $S_1; S_2$ [3.5.3])

□


Compression  is  used  to  remove  statements  from  the  code  when  they  are  no  longer 
needed.  A pair  of  statements  will  compress  down  to  the  second  statement  if  the 
meaning  of  the  second  is  the  same  as  the  meaning  of  the  pair.  (This  may  be  combined 
with  statement  interchange  to  provide  compression  to  the  first  statement.) 

Definition 4.1.2 (Compressible statements)

A pair of statements $S_1; S_2$ is compressible (to $S_2$) in state $\sigma$ if
$\mathcal{M}(S_1; S_2)(\sigma) = \mathcal{M}(S_2)(\sigma)$.

The  simplest  cases  for  statement  compression  follow  from  the  conditions  below. 
More  complex  compression  can  take  place  by  first  applying  the  other  transformations 
(such  as  absorption  into  if  statements)  to  create  these  conditions.  This  will  be 
discussed  in  more  detail  in  chapter  5. 

Theorem 4.1.6 (Compression of statements)

The following are sufficient for the compression of $S_1; S_2$:

a) $S_1 = v := s_1$ and $S_2 = v := s_2$ when $\mathit{sets}(v := s_1) \cap \mathit{uses}(v := s_2) = \emptyset$.

b) $S_1$ is nullable.

Proof: 

The  proof  of  part  b)  follows  directly  from  the  definition  of  nullable  statements.  The 
proof  of  part  a)  is  provided  below. 

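$\mathcal{M}(v := s_1;\ v := s_2)(\sigma)$

$= \mathcal{M}(v := s_2)(\mathcal{M}(v := s_1)(\sigma))$ (Def. of $S_1; S_2$ [3.5.3])

$= (\mathcal{M}(v := s_1)(\sigma))\{\mathcal{R}(s_2)(\mathcal{M}(v := s_1)(\sigma))/\mathcal{L}(v)(\mathcal{M}(v := s_1)(\sigma))\}$

(Def. of $v := s$ [3.5.3])

$= (\mathcal{M}(v := s_1)(\sigma))\{\mathcal{R}(s_2)(\sigma)/\mathcal{L}(v)(\sigma)\}$ (Def. of empty sets and uses [3.8.2])

$= (\sigma\{\mathcal{R}(s_1)(\sigma)/\mathcal{L}(v)(\sigma)\})\{\mathcal{R}(s_2)(\sigma)/\mathcal{L}(v)(\sigma)\}$ (Def. of $v := s$ [3.5.3])

$= \sigma\{\mathcal{R}(s_2)(\sigma)/\mathcal{L}(v)(\sigma)\}$ (Elimination of state variant [3.6.4])

$= \mathcal{M}(v := s_2)(\sigma)$ (Def. of $v := s$ [3.5.3])

□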
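The condition in part a) can be tried out on a toy model. The following Python sketch is only an illustration under stated assumptions: states are modelled as dictionaries, a statement as a function from state to state (standing in for its meaning), and the helper names `assign` and `seq` are hypothetical, not part of the language defined in Chapter 3.

```python
# Toy model (assumption): a state is a dict, a statement is a state -> state function.

def assign(var, expr):
    """Meaning of `var := expr`; expr is a function of the current state."""
    def run(state):
        new_state = dict(state)
        new_state[var] = expr(state)
        return new_state
    return run

def seq(*stmts):
    """Meaning of S1; S2; ...: apply each statement in order."""
    def run(state):
        for stmt in stmts:
            state = stmt(state)
        return state
    return run

s1 = assign("v", lambda st: st["a"] + 1)  # v := a + 1
s2 = assign("v", lambda st: st["b"] * 2)  # v := b * 2, does not use v
s3 = assign("v", lambda st: st["v"] * 2)  # v := v * 2, uses v

sigma = {"a": 3, "b": 5, "v": 0}

# s1; s2 has the same meaning as s2 alone, so the pair compresses to s2.
compressible = seq(s1, s2)(sigma) == s2(sigma)
# s1; s3 does not: s3 reads the value that s1 just stored in v.
not_compressible = seq(s1, s3)(sigma) == s3(sigma)
```

In this sketch the second pair fails exactly because the set of variables written by the first statement overlaps the variables read by the second.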


4.2  Primitive  if  Statement  Transformations 


There are a variety of low-level if statement transformations. The first,
absorption of statements into if statements, moves statements from either before
or after the if statement into both the then and else clauses of the if statement
(or conversely, moves statements out of those clauses). Bottom factoring is a
widely recognized method of if statement simplification: statements following an
if statement may always be moved into the clauses. The case for top factoring is
somewhat more complex. Absorption into if statements is discussed in the
following theorems.

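Theorem 4.2.1 (Absorption of statements into if statements)

$\models S;\ \mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi} = \mathbf{if}\ b\ \mathbf{then}\ S; S_1\ \mathbf{else}\ S; S_2\ \mathbf{fi}$

Provided:

$\mathit{sets}(\mathcal{M}(S)) \cap \mathit{uses}(b) = \emptyset$

and

$\models \mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi};\ S = \mathbf{if}\ b\ \mathbf{then}\ S_1; S\ \mathbf{else}\ S_2; S\ \mathbf{fi}$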

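Proof of the first part:

$\mathcal{M}(S;\ \mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi})(\sigma)$

$= \mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi})(\mathcal{M}(S)(\sigma))$

(Def. of $S_1; S_2$ [3.5.3])

$= \mathit{if}\ \mathcal{W}(b)(\mathcal{M}(S)(\sigma))\ \mathit{then}\ \mathcal{M}(S_1)(\mathcal{M}(S)(\sigma))\ \mathit{else}\ \mathcal{M}(S_2)(\mathcal{M}(S)(\sigma))\ \mathit{fi}$

(Def. of if [3.5.3])

$= \mathit{if}\ \mathcal{W}(b)(\sigma)\ \mathit{then}\ \mathcal{M}(S_1)(\mathcal{M}(S)(\sigma))\ \mathit{else}\ \mathcal{M}(S_2)(\mathcal{M}(S)(\sigma))\ \mathit{fi}$

(Def. of empty sets and uses [3.8.3])

$= \mathit{if}\ \mathcal{W}(b)(\sigma)\ \mathit{then}\ \mathcal{M}(S; S_1)(\sigma)\ \mathit{else}\ \mathcal{M}(S; S_2)(\sigma)\ \mathit{fi}$

(Def. of $S_1; S_2$ [3.5.3])

$= \mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S; S_1\ \mathbf{else}\ S; S_2\ \mathbf{fi})(\sigma)$ (Def. of if [3.5.3])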


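Proof of the second part:

$\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi};\ S)(\sigma)$

$= \mathcal{M}(S)(\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi})(\sigma))$

(Def. of $S_1; S_2$ [3.5.3])

$= \mathcal{M}(S)(\mathit{if}\ \mathcal{W}(b)(\sigma)\ \mathit{then}\ \mathcal{M}(S_1)(\sigma)\ \mathit{else}\ \mathcal{M}(S_2)(\sigma)\ \mathit{fi})$

(Def. of if [3.5.3])

$= \mathit{if}\ \mathcal{W}(b)(\sigma)\ \mathit{then}\ \mathcal{M}(S)(\mathcal{M}(S_1)(\sigma))\ \mathit{else}\ \mathcal{M}(S)(\mathcal{M}(S_2)(\sigma))\ \mathit{fi}$

(Moving $\mathcal{M}(S)$ into both branches)

$= \mathit{if}\ \mathcal{W}(b)(\sigma)\ \mathit{then}\ \mathcal{M}(S_1; S)(\sigma)\ \mathit{else}\ \mathcal{M}(S_2; S)(\sigma)\ \mathit{fi}$

(Def. of $S_1; S_2$ [3.5.3])

$= \mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1; S\ \mathbf{else}\ S_2; S\ \mathbf{fi})(\sigma)$ (Def. of if [3.5.3])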
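Both parts of the absorption theorem, and the need for the side condition in the first part, can be illustrated with a small executable model. This Python sketch is an assumption-laden illustration: states are dictionaries, statements are state-to-state functions, and `assign`, `seq`, and `if_stmt` are hypothetical helper names.

```python
def assign(var, expr):
    """Meaning of `var := expr`; expr is a function of the current state."""
    return lambda st: {**st, var: expr(st)}

def seq(*stmts):
    """Meaning of S1; S2; ...: apply each statement in order."""
    def run(state):
        for stmt in stmts:
            state = stmt(state)
        return state
    return run

def if_stmt(cond, then_s, else_s):
    """Meaning of `if b then S1 else S2 fi`."""
    return lambda st: then_s(st) if cond(st) else else_s(st)

b = lambda st: st["y"] > 0                 # the condition uses only y
S = assign("z", lambda st: st["y"] + 1)    # S sets z, so sets(S) and uses(b) are disjoint
T = assign("y", lambda st: -st["y"])       # T sets y, which b uses
S1 = assign("w", lambda st: st["z"])
S2 = assign("w", lambda st: -st["z"])

states = [{"y": y, "z": 1, "w": 0} for y in (-2, 0, 3)]

# Moving S from before the if into both clauses (side condition holds).
top_ok = all(seq(S, if_stmt(b, S1, S2))(st) == if_stmt(b, seq(S, S1), seq(S, S2))(st)
             for st in states)
# Moving S from after the if into both clauses (no side condition needed).
bottom_ok = all(seq(if_stmt(b, S1, S2), S)(st) == if_stmt(b, seq(S1, S), seq(S2, S))(st)
                for st in states)
# Moving T from before the if changes the meaning, because T alters the condition.
top_fails = any(seq(T, if_stmt(b, S1, S2))(st) != if_stmt(b, seq(T, S1), seq(T, S2))(st)
                for st in states)
```

The check over three sample states is of course not a proof; it only shows the equalities and the counterexample on concrete data.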


In  some  cases  it  may  be  desirable  to  move  statements  from  before  an  if  statement 
into  the  if  statement  even  if  those  statements  possibly  change  the  meaning  of  the 
condition.  Any  assignment  statement  which  assigns  to  a simple  variable  can  be  moved 
into  an  if  statement,  if  there  is  substitution  performed  in  the  condition. 

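Theorem 4.2.2 (Absorption of statements into if statements with substitution)

$\models x := s;\ \mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi} = \mathbf{if}\ b[s/x]\ \mathbf{then}\ x := s; S_1\ \mathbf{else}\ x := s; S_2\ \mathbf{fi}$

Proof: This can be proved by manipulation of the meaning of
$x := s;\ \mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi}$.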

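$\mathcal{M}(x := s;\ \mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi})(\sigma)$

$= \mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi})(\mathcal{M}(x := s)(\sigma))$

(Def. of $S_1; S_2$ [3.5.3])

$= \mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi})(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})$

(Def. of $v := s$ [3.5.3])

$= \mathit{if}\ \mathcal{W}(b)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})\ \mathit{then}\ \mathcal{M}(S_1)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})$
$\quad \mathit{else}\ \mathcal{M}(S_2)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})\ \mathit{fi}$ (Def. of if [3.5.3])

$= \mathit{if}\ \mathcal{W}(b[s/x])(\sigma)\ \mathit{then}\ \mathcal{M}(S_1)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})$
$\quad \mathit{else}\ \mathcal{M}(S_2)(\sigma\{\mathcal{R}(s)(\sigma)/\mathcal{L}(x)(\sigma)\})\ \mathit{fi}$ (Lemma 3.7.1)

$= \mathit{if}\ \mathcal{W}(b[s/x])(\sigma)\ \mathit{then}\ \mathcal{M}(S_1)(\mathcal{M}(x := s)(\sigma))$
$\quad \mathit{else}\ \mathcal{M}(S_2)(\mathcal{M}(x := s)(\sigma))\ \mathit{fi}$ (Def. of $v := s$ [3.5.3])

$= \mathit{if}\ \mathcal{W}(b[s/x])(\sigma)\ \mathit{then}\ \mathcal{M}(x := s; S_1)(\sigma)$
$\quad \mathit{else}\ \mathcal{M}(x := s; S_2)(\sigma)\ \mathit{fi}$ (Def. of $S_1; S_2$ [3.5.3])

$= \mathcal{M}(\mathbf{if}\ b[s/x]\ \mathbf{then}\ x := s; S_1\ \mathbf{else}\ x := s; S_2\ \mathbf{fi})(\sigma)$

(Def. of if [3.5.3])

□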
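A small experiment shows why the substitution into the condition is needed. The Python sketch below is illustrative only. It assumes a dictionary model of states and models the substitution $b[s/x]$ semantically (evaluate $b$ with $x$ bound to the value of $s$) rather than syntactically; the helpers `assign`, `seq`, `if_stmt`, and `subst` are hypothetical names.

```python
def assign(var, expr):
    """Meaning of `var := expr`; expr is a function of the current state."""
    return lambda st: {**st, var: expr(st)}

def seq(*stmts):
    """Meaning of S1; S2; ...: apply each statement in order."""
    def run(state):
        for stmt in stmts:
            state = stmt(state)
        return state
    return run

def if_stmt(cond, then_s, else_s):
    """Meaning of `if b then S1 else S2 fi`."""
    return lambda st: then_s(st) if cond(st) else else_s(st)

def subst(expr, var, repl):
    """Semantic stand-in for expr[repl/var]: evaluate expr with var bound to repl's value."""
    return lambda st: expr({**st, var: repl(st)})

b = lambda st: st["x"] > 0          # b is x > 0
s = lambda st: st["x"] + 1          # s is x + 1
x_assign = assign("x", s)           # x := x + 1
S1 = assign("y", lambda st: st["x"])
S2 = assign("y", lambda st: -st["x"])

before = seq(x_assign, if_stmt(b, S1, S2))
after = if_stmt(subst(b, "x", s), seq(x_assign, S1), seq(x_assign, S2))
naive = if_stmt(b, seq(x_assign, S1), seq(x_assign, S2))  # no substitution in b

states = [{"x": x, "y": 0} for x in (-2, -1, 0, 3)]
with_subst_ok = all(before(st) == after(st) for st in states)
naive_ok = all(before(st) == naive(st) for st in states)
```

With the substituted condition the two programs agree on every sample state; without it they disagree when x starts at 0, since the assignment flips the truth value of the condition.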

In some cases, the else clause of an if then else statement is empty, resulting
in an if then statement. In order to simplify notation, both the if then statement
and the max and min functions are defined below.

Definition 4.2.1 (max and min)

$\max(m, n) = \mathit{if}\ m > n\ \mathit{then}\ m\ \mathit{else}\ n\ \mathit{fi}$
$\min(m, n) = \mathit{if}\ m < n\ \mathit{then}\ m\ \mathit{else}\ n\ \mathit{fi}$

Definition 4.2.2 (if then statement)

$\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{fi} = \mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ D\ \mathbf{fi}$

Since later optimizations only work with if then statements (as opposed to if
then else statements), it may be necessary to transform an if then else statement
into two if then statements. This is done in the following theorem.

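Theorem 4.2.3 (Splitting if then else statements)

$\models \mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi} = \mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{fi};\ \mathbf{if}\ \neg b\ \mathbf{then}\ S_2\ \mathbf{fi}$

Provided:

$\mathit{sets}(\mathcal{M}(S_1)) \cap \mathit{uses}(b) = \emptyset$.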


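Proof:

$\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi})(\sigma)$

$= \mathit{if}\ \mathcal{W}(b)(\sigma)\ \mathit{then}\ \mathcal{M}(S_1)(\sigma)\ \mathit{else}\ \mathcal{M}(S_2)(\sigma)\ \mathit{fi}$

(Def. of if [3.5.3])

$= \mathit{if}\ \neg\mathcal{W}(b)(\sigma)\ \mathit{then}\ \mathcal{M}(S_2)(\sigma)\ \mathit{else}\ \mathcal{M}(S_1)(\sigma)\ \mathit{fi}$

(Meaning of if)

$= \mathit{if}\ \mathcal{W}(\neg b)(\sigma)\ \mathit{then}\ \mathcal{M}(S_2)(\sigma)\ \mathit{else}\ \mathcal{M}(S_1)(\sigma)\ \mathit{fi}$

(Def. of $\mathcal{W}(\neg b)$ [3.5.2])

$= \mathit{if}\ \mathcal{W}(\neg b)(\sigma)\ \mathit{then}\ \mathcal{M}(S_2)(\mathcal{M}(D)(\sigma))\ \mathit{else}\ \mathcal{M}(D)(\mathcal{M}(S_1)(\sigma))\ \mathit{fi}$

(Def. of D [3.5.3])

$= \mathit{if}\ \mathcal{W}(\neg b)(\sigma)\ \mathit{then}$
$\quad \mathit{if}\ \mathcal{W}(b)(\sigma)\ \mathit{then}\ \mathcal{M}(S_2)(\mathcal{M}(S_1)(\sigma))\ \mathit{else}\ \mathcal{M}(S_2)(\mathcal{M}(D)(\sigma))\ \mathit{fi}$
$\mathit{else}\ \mathit{if}\ \mathcal{W}(b)(\sigma)\ \mathit{then}\ \mathcal{M}(D)(\mathcal{M}(S_1)(\sigma))$
$\quad \mathit{else}\ \mathcal{M}(D)(\mathcal{M}(D)(\sigma))\ \mathit{fi}\ \mathit{fi}$ (Vacuously true)

$= \mathit{if}\ \mathcal{W}(\neg b)(\sigma)\ \mathit{then}$
$\quad \mathcal{M}(S_2)(\mathit{if}\ \mathcal{W}(b)(\sigma)\ \mathit{then}\ \mathcal{M}(S_1)(\sigma)\ \mathit{else}\ \mathcal{M}(D)(\sigma)\ \mathit{fi})$
$\mathit{else}\ \mathcal{M}(D)(\mathit{if}\ \mathcal{W}(b)(\sigma)\ \mathit{then}\ \mathcal{M}(S_1)(\sigma)\ \mathit{else}\ \mathcal{M}(D)(\sigma)\ \mathit{fi})\ \mathit{fi}$

(Removing $\mathcal{M}(S_2)$ from both of the branches in the first clause and
$\mathcal{M}(D)$ from both branches in the second)

$= \mathit{if}\ \mathcal{W}(\neg b)(\sigma)\ \mathit{then}$
$\quad \mathcal{M}(S_2)(\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ D\ \mathbf{fi})(\sigma))$
$\mathit{else}\ \mathcal{M}(D)(\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ D\ \mathbf{fi})(\sigma))\ \mathit{fi}$

(Def. of if [3.5.3])

$= \mathit{if}\ \mathcal{W}(\neg b)(\sigma)\ \mathit{then}$
$\quad \mathcal{M}(S_2)(\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{fi})(\sigma))$
$\mathit{else}\ \mathcal{M}(D)(\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{fi})(\sigma))\ \mathit{fi}$

(Def. of if then fi [4.2.2])

$= \mathit{if}\ \mathcal{W}(\neg b)(\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{fi})(\sigma))\ \mathit{then}$
$\quad \mathcal{M}(S_2)(\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{fi})(\sigma))$
$\mathit{else}\ \mathcal{M}(D)(\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{fi})(\sigma))\ \mathit{fi}$

(Def. of empty sets and uses [3.8.3])

$= \mathcal{M}(\mathbf{if}\ \neg b\ \mathbf{then}\ S_2\ \mathbf{else}\ D\ \mathbf{fi})(\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{fi})(\sigma))$

(Def. of if [3.5.3])

$= \mathcal{M}(\mathbf{if}\ \neg b\ \mathbf{then}\ S_2\ \mathbf{fi})(\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{fi})(\sigma))$

(Def. of if then fi [4.2.2])

$= \mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{fi};\ \mathbf{if}\ \neg b\ \mathbf{then}\ S_2\ \mathbf{fi})(\sigma)$

(Def. of $S_1; S_2$ [3.5.3])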
Corollary 4.2.4 (Splitting if then else statements)

$\models \mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi} = \mathbf{if}\ \neg b\ \mathbf{then}\ S_2\ \mathbf{fi};\ \mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{fi}$

Provided:

$\mathit{sets}(\mathcal{M}(S_2)) \cap \mathit{uses}(b) = \emptyset$

This is proved in the same way as Theorem 4.2.3.
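The role of the side condition in splitting can be seen on a toy model. The Python sketch below is only an illustration: it assumes dictionary states and state-to-state functions as statements, and the helpers `assign`, `seq`, `if_stmt`, `if_then`, and `skip` (standing in for the empty statement D) are hypothetical names.

```python
def assign(var, expr):
    """Meaning of `var := expr`; expr is a function of the current state."""
    return lambda st: {**st, var: expr(st)}

def seq(*stmts):
    """Meaning of S1; S2; ...: apply each statement in order."""
    def run(state):
        for stmt in stmts:
            state = stmt(state)
        return state
    return run

def if_stmt(cond, then_s, else_s):
    """Meaning of `if b then S1 else S2 fi`."""
    return lambda st: then_s(st) if cond(st) else else_s(st)

skip = lambda st: dict(st)                 # the empty statement D

def if_then(cond, then_s):
    """`if b then S fi`, defined as `if b then S else D fi`."""
    return if_stmt(cond, then_s, skip)

b = lambda st: st["y"] > 0
not_b = lambda st: not b(st)
S1 = assign("w", lambda st: 1)             # does not set y
S1_bad = assign("y", lambda st: -st["y"])  # sets y, which b uses
S2 = assign("w", lambda st: 2)

def whole(then_s):
    return if_stmt(b, then_s, S2)

def split(then_s):
    return seq(if_then(b, then_s), if_then(not_b, S2))

states = [{"y": y, "w": 0} for y in (-2, 0, 3)]
split_ok = all(whole(S1)(st) == split(S1)(st) for st in states)
split_bad = all(whole(S1_bad)(st) == split(S1_bad)(st) for st in states)
```

When the then clause changes a variable used by the condition, the second if then statement may fire after the first one already did, which is the situation the side condition rules out.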

Finally, some if statements can be simplified to single simpler statements. If the
truth value of a condition is the same in all states, an if statement can be simplified
to either the then or the else clause. If both the then and the else clauses of an if
statement contain nullable statements, then the if statement can become the empty
statement. This result illustrates an interesting consequence of the fact that no
effects of errors are considered during evaluation of expressions. Consider the
statement if x/0 = 4 then D else D fi. If this were reduced to simply D, the meaning
of the code would be vastly different: the first case would cause a run-time error
and the second would have no such error. The third clause, combined with if
statement extraction (Lemma 4.2.1), allows the simplification of statements of the
form if b then S else S fi.

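Theorem 4.2.5 (if simplification)

$\models (\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi}) = S_1$

Provided: $\mathcal{W}(b)(\sigma) = T$ for all $\sigma \in S$

$\models (\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi}) = S_2$

Provided: $\mathcal{W}(b)(\sigma) = F$ for all $\sigma \in S$

$\models (\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi}) = D$

Provided: $S_1$ and $S_2$ are nullable.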
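Proof of the first part:

$\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi})(\sigma)$

$= \mathit{if}\ \mathcal{W}(b)(\sigma)\ \mathit{then}\ \mathcal{M}(S_1)(\sigma)\ \mathit{else}\ \mathcal{M}(S_2)(\sigma)\ \mathit{fi}$

(Def. of if [3.5.3])

$= \mathcal{M}(S_1)(\sigma)$ (Given)


Proof of the second part:

$\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi})(\sigma)$

$= \mathit{if}\ \mathcal{W}(b)(\sigma)\ \mathit{then}\ \mathcal{M}(S_1)(\sigma)\ \mathit{else}\ \mathcal{M}(S_2)(\sigma)\ \mathit{fi}$

(Def. of if [3.5.3])

$= \mathcal{M}(S_2)(\sigma)$ (Given)


Proof of the third part:

$\mathcal{M}(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi})(\sigma)$

$= \mathit{if}\ \mathcal{W}(b)(\sigma)\ \mathit{then}\ \mathcal{M}(S_1)(\sigma)\ \mathit{else}\ \mathcal{M}(S_2)(\sigma)\ \mathit{fi}$

(Def. of if [3.5.3])

$= \mathit{if}\ \mathcal{W}(b)(\sigma)\ \mathit{then}\ \mathcal{M}(D)(\sigma)\ \mathit{else}\ \mathcal{M}(D)(\sigma)\ \mathit{fi}$

(Given)

$= \mathcal{M}(D)(\sigma)$ (Meaning of if)

□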


4.3  Primitive  Loop  Transformation 

There  is  only  one  basic  transformation  needed  for  loop  statements  which  execute 
at  least  once,  the  unrolling  of  the  for,  removing  either  the  first  or  last  iteration  of 
the  statement  from  the  loop.  If  the  loop  does  not  execute  at  least  once,  the  only 
transformation  necessary  is  changing  it  into  the  empty  statement.  All  subsequent 
for  statement  transformations  will  be  proved  by  viewing  the  transformation  as  if  it 
unrolls  the  for  statement  entirely,  performs  the  transformation,  and  then  rerolls  the 
loop.  The  most  complicated  case  of  this  is  unrolling  a statement  to  before  the  for 
statement.  This  is  discussed  in  the  second  part  of  Theorem  4.3.2. 


Theorem 4.3.1 (Loop elimination)

$\mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})(\sigma) = \sigma$

if $\mathcal{R}(s_1)(\sigma) > \mathcal{R}(s_2)(\sigma)$


This proof follows directly from the definition of the semantics of for statements
(Definition 3.5.3).


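Theorem 4.3.2 (Loop rolling and unrolling)

$\models \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})(\sigma)$

$= \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S\ \mathbf{od};\ S[\mathcal{R}(s_2)(\sigma)/x])(\sigma)$

$= \mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x];\ \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\sigma)$

if $\mathcal{R}(s_1)(\sigma) \le \mathcal{R}(s_2)(\sigma)$

The first two parts are equal by the definition of for statements (Definition 3.5.3).
The proof of the equivalence of the first and third part is by mathematical induction
on the number of times through the loop ($\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1$) and follows.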


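Basis: $\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1 = 1$

$\mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})(\sigma)$

$= \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S\ \mathbf{od};\ S[\mathcal{R}(s_2)(\sigma)/x])(\sigma)$ (Def. of for [3.5.3])

$= \mathcal{M}(D;\ S[\mathcal{R}(s_2)(\sigma)/x])(\sigma)$ (Def. of for [3.5.3])

$= \mathcal{M}(S[\mathcal{R}(s_2)(\sigma)/x])(\sigma)$ (Def. of D [3.5.3])

$= \mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma)$ (Given)

$= \mathcal{M}(D)(\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))$ (Def. of D [3.5.3])

$= \mathcal{M}(\mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))$

(Def. of for [3.5.3])

$= \mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x];\ \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\sigma)$

(Def. of $S_1; S_2$ [3.5.3])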


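Induction step: Assume that
$\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od} = S[\mathcal{R}(s_1)(\sigma)/x];\ \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od}$
whenever $\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1 = k$ ($k > 0$).

Show it is true when $\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1 = k + 1$.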

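$\mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})(\sigma)$

$= \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S\ \mathbf{od};\ S[\mathcal{R}(s_2)(\sigma)/x])(\sigma)$

(Def. of for [3.5.3])

$= \mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x];$
$\quad \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2 - 1)(\sigma)\ \mathbf{do}\ S\ \mathbf{od};\ S[\mathcal{R}(s_2)(\sigma)/x])(\sigma)$

(Induction hypothesis)

$= \mathcal{M}(\mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2 - 1)(\sigma)\ \mathbf{do}\ S\ \mathbf{od};\ S[\mathcal{R}(s_2)(\sigma)/x])$
$\quad (\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))$

(Def. of $S_1; S_2$ [3.5.3])

$= \mathcal{M}(\mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma) - 1\ \mathbf{do}\ S\ \mathbf{od};\ S[\mathcal{R}(s_2)(\sigma)/x])$
$\quad (\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))$

(Def. of $\mathcal{R}(s - 1)$ [3.5.1])

$= \mathcal{M}(\mathbf{for}\ x := \mathcal{R}(\mathcal{R}(s_1)(\sigma) + 1)(\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))$
$\quad \mathbf{to}\ \mathcal{R}(\mathcal{R}(s_2)(\sigma) - 1)(\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))\ \mathbf{do}\ S\ \mathbf{od};$
$\quad S[\mathcal{R}(\mathcal{R}(s_2)(\sigma))(\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))/x])\ (\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))$

(Def. of $\mathcal{R}(m)$ [3.5.1])

$= \mathcal{M}(\mathbf{for}\ x := \mathcal{R}(\mathcal{R}(s_1)(\sigma))(\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma)) + 1$
$\quad \mathbf{to}\ \mathcal{R}(\mathcal{R}(s_2)(\sigma))(\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma)) - 1\ \mathbf{do}\ S\ \mathbf{od};$
$\quad S[\mathcal{R}(\mathcal{R}(s_2)(\sigma))(\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))/x])\ (\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))$

(Def. of $\mathcal{R}(s - 1)$ [3.5.1])

$= \mathcal{M}(\mathbf{for}\ x := \mathcal{R}(\mathcal{R}(s_1)(\sigma))(\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma)) + 1$
$\quad \mathbf{to}\ \mathcal{R}(\mathcal{R}(s_2)(\sigma))(\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))\ \mathbf{do}\ S\ \mathbf{od})$
$\quad (\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))$ (Def. of for [3.5.3])

$= \mathcal{M}(\mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})$
$\quad (\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))$ (Def. of $\mathcal{R}(m)$ [3.5.1])

$= \mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x];$
$\quad \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\sigma)$

(Def. of $S_1; S_2$ [3.5.3])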
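The three forms in the unrolling theorem, together with loop elimination, can be compared on a toy model. This Python sketch is illustrative only and rests on assumptions: the loop bounds are already evaluated to integers, `body(k)` plays the role of $S[k/x]$, and `seq` and `for_loop` are hypothetical helper names.

```python
def seq(*stmts):
    """Meaning of S1; S2; ...: apply each statement in order."""
    def run(state):
        for stmt in stmts:
            state = stmt(state)
        return state
    return run

def for_loop(lo, hi, body):
    """`for x := lo to hi do S od`, where body(k) stands for S[k/x]."""
    def run(state):
        for k in range(lo, hi + 1):
            state = body(k)(state)
        return state
    return run

def body(k):
    # S[k/x]: acc := acc * 2 + k, deliberately sensitive to iteration order
    return lambda st: {**st, "acc": st["acc"] * 2 + k}

sigma = {"acc": 0}
lo, hi = 1, 4

whole = for_loop(lo, hi, body)(sigma)
# Last iteration removed from the loop and placed after it.
last_peeled = seq(for_loop(lo, hi - 1, body), body(hi))(sigma)
# First iteration removed from the loop and placed before it.
first_peeled = seq(body(lo), for_loop(lo + 1, hi, body))(sigma)
# A loop whose lower bound exceeds its upper bound leaves the state unchanged.
empty = for_loop(3, 2, body)(sigma)
```

Because the body is order-sensitive, agreement of the three results indicates that peeling preserves the order of the iterations, not just their number.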


CHAPTER  5 

GLOBAL  OPTIMIZATIONS 


The  basic  transformations  discussed  in  Chapter  4 can  be  combined  to  give  tradi- 
tional data  flow  analysis  optimizations  and  some  new  optimizations  as  well.  Most  of 
these  transformations  can  be  viewed  as  unrolling  one  or  more  loops,  interchanging  the 
resulting  unrolled  statements,  possibly  with  some  statement  elimination,  restructur- 
ing, or  simplification  (such  as  statement  compression  or  if  then  else  simplification) 
and  rerolling  of  the  code  into  a loop  again. 

Loop  joining  (Section  5.1)  unrolls  the  statements  in  two  adjacent  loops,  rearranges 
them,  and  rerolls  them  into  a single  loop.  Similarly,  loop  interchanging  (Section  5.2) 
requires  no  elimination  or  restructuring  of  the  unrolled  statements — the  statements 
in  a pair  of  nested  loops  are  simply  unrolled,  rearranged,  and  then  rerolled.  In  code 
motion  (Section  5.3),  after  the  loops  are  unrolled,  the  copies  of  the  statement  being 
removed  from  the  loop  are  moved  to  the  beginning  of  the  unrolled  statements.  These 
statements  are  then  compressed  to  a single  statement  and  the  remaining  statements 
are rerolled, giving a single statement followed by a loop. Finally, loop-conditional
joining (Section 5.4) unrolls the loop, simplifies the conditional statement which
makes up the body of the loop (and in doing so may remove some of the conditional
statements), and then rerolls the remaining statements, which no longer have the
original conditional clause.

Besides describing each complex transformation in terms of the minimal
transformations, it is necessary to determine when the complex transformations can
occur and when they will improve the code. In all but the smallest of programs, these
transformations must be attempted in some order. Heuristic methods for combining
these transformations and the transformations presented in the previous chapter are
presented in the context of a prototype optimizer in Section 6.4.

5.1  Loop  Joining 

Loop  joining,  that  is,  the  combining  of  two  (or  more,  by  repeated  application) 
loops  that  are  executed  over  a similar  range  of  loop  boundaries,  is  another  transfor- 
mation which  provides  its  greatest  benefits  in  rearranging  the  grouping  of  statements. 
Loop  joining  has  some  benefits  in  eliminating  the  initialization  of  the  second  loop  and 
consolidating  the  loop  control  variable  increment  and  examination  costs  of  two  loops. 
Loop  joining  (and  the  reverse  operation  of  loop  splitting)  can  be  part  of  more  spec- 
tacular benefits  when  they  are  used  to  rearrange  code  so  that  other  optimizations, 
most  notably  loop-conditional  joining,  can  occur. 

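Theorem 5.1.1 (Loop Joining)

$\models \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_1\ \mathbf{od};\ \mathbf{for}\ y := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_2\ \mathbf{od})(\sigma) =$

$\mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_1; S_2[x/y]\ \mathbf{od})(\sigma)$

Provided:

$\mathit{sets}(S_1) \cap \mathit{uses}(s_1) = \emptyset$
$\mathit{sets}(S_1) \cap \mathit{uses}(s_2) = \emptyset$

$\forall y_1, y_2 \ni \mathcal{R}(s_1)(\sigma) \le y_1 < y_2 \le \mathcal{R}(s_2)(\sigma)$, $S_1[y_1/x]$ is interchangeable with $S_2[y_2/y]$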

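Proof: By induction on the number of times through the loop,
($\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1$).

Basis: ($\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1$) $\le 0$ (no iterations).

$\mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_1\ \mathbf{od};\ \mathbf{for}\ y := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_2\ \mathbf{od})(\sigma)$

$= \mathcal{M}(D;\ \mathbf{for}\ y := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_2\ \mathbf{od})(\sigma)$ (Loop unrolling [4.3.1])

$= \mathcal{M}(\mathbf{for}\ y := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_2\ \mathbf{od})(\sigma)$ (Def. of D [3.5.3])

$= \mathcal{M}(D)(\sigma)$ (Loop unrolling [4.3.1])

$= \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_1; S_2[x/y]\ \mathbf{od})(\sigma)$

(Loop rolling [4.3.1])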


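Induction step: Assume that
$\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_1\ \mathbf{od};\ \mathbf{for}\ y := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_2\ \mathbf{od} = \mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_1; S_2[x/y]\ \mathbf{od}$
in states where ($\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1$) $= k$ ($k \ge 0$).


Show it is true in states where ($\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1$) $= k + 1$.

$\mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_1\ \mathbf{od};\ \mathbf{for}\ y := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_2\ \mathbf{od})(\sigma)$

$= \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S_1\ \mathbf{od};\ S_1[\mathcal{R}(s_2)(\sigma)/x];$
$\quad \mathbf{for}\ y := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S_2\ \mathbf{od};\ S_2[\mathcal{R}(s_2)(\sigma)/y])(\sigma)$

(Loop unrolling [4.3.2])

$= \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S_1\ \mathbf{od};\ \mathbf{for}\ y := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S_2\ \mathbf{od};$
$\quad S_1[\mathcal{R}(s_2)(\sigma)/x];\ S_2[\mathcal{R}(s_2)(\sigma)/y])(\sigma)$ (Given)

$= \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S_1; S_2[x/y]\ \mathbf{od};$
$\quad S_1[\mathcal{R}(s_2)(\sigma)/x];\ S_2[\mathcal{R}(s_2)(\sigma)/y])(\sigma)$ (Induction hypothesis)

$= \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S_1; S_2[x/y]\ \mathbf{od};$
$\quad S_1[\mathcal{R}(s_2)(\sigma)/x];\ S_2[x/y][\mathcal{R}(s_2)(\sigma)/x])(\sigma)$

(Subst. into stat. [3.4.2])

$= \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S_1; S_2[x/y]\ \mathbf{od};$
$\quad (S_1; S_2[x/y])[\mathcal{R}(s_2)(\sigma)/x])(\sigma)$ (Def. of subst. into stat. [3.4.4])

$= \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_1; S_2[x/y]\ \mathbf{od})(\sigma)$

(Loop rolling [4.3.2])


□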
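Loop joining can be tried on concrete loop bodies. The Python sketch below is an illustration under assumptions, not the dissertation's formalism: array elements are modelled as dictionary keys such as `("a", k)`, loop bounds are integers, `first(k)` and `second(k)` stand for $S_1[k/x]$ and $S_2[k/y]$, and all helper names are hypothetical.

```python
def seq(*stmts):
    """Meaning of S1; S2; ...: apply each statement in order."""
    def run(state):
        for stmt in stmts:
            state = stmt(state)
        return state
    return run

def for_loop(lo, hi, body):
    """`for x := lo to hi do S od`, where body(k) stands for S[k/x]."""
    def run(state):
        for k in range(lo, hi + 1):
            state = body(k)(state)
        return state
    return run

def S1(k):          # a[k] := k * k
    return lambda st: {**st, ("a", k): k * k}

def S2(k):          # b[k] := a[k] + 1
    return lambda st: {**st, ("b", k): st[("a", k)] + 1}

def S2_shifted(k):  # b[k] := a[k + 1] + 1, reading 0 when a[k + 1] is not yet set
    return lambda st: {**st, ("b", k): st.get(("a", k + 1), 0) + 1}

def separate(lo, hi, first, second):
    """Two adjacent loops over the same range."""
    return seq(for_loop(lo, hi, first), for_loop(lo, hi, second))

def joined(lo, hi, first, second):
    """One loop whose body is first; second with the index renamed."""
    return for_loop(lo, hi, lambda k: seq(first(k), second(k)))

sigma = {}
joinable = separate(1, 4, S1, S2)(sigma) == joined(1, 4, S1, S2)(sigma)
not_joinable = (separate(1, 4, S1, S2_shifted)(sigma)
                == joined(1, 4, S1, S2_shifted)(sigma))
```

In the second pairing the joined loop reads an element of `a` before the iteration that writes it has run, so the required statement interchanges are not available and the two forms differ.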


5.2  Loop  Interchange 

Loop interchanging is a code transformation which may not provide any immediate
benefit to the code. It is used to change the order of the indices of two (or more,
by repeated application) loops. This can be viewed as a change from row-major to
column-major traversal of an array. (Alternatively, this may be viewed as a
transition from column-major to row-major; for the sake of consistency the row-major
to column-major interpretation will be used here.) There may be a slight increase or
decrease in the speed of the code because loop interchange will change the number
of times the inner loop will be initialized and may change the number of times each
of the loop conditions will be evaluated. While this change in loop overhead may
seem to be reason enough for loop interchanging in extreme conditions such as the
one in Figure 5.1, it is usually insignificant when compared to the real power of loop
interchanging.

for x1 := 1 to 1000 do                 for x2 := 1 to 2 do
    for x2 := 1 to 2 do                    for x1 := 1 to 1000 do
        S                                      S
    od                                     od
od                                     od

Outer loop initialization       1      Outer loop initialization       1
Inner loop initialization    1000      Inner loop initialization       2
Outer loop condition check   1001      Outer loop condition check      3
Inner loop condition check   3000      Inner loop condition check   2002

Figure 5.1. Loop Interchange May Change the Number of Loop Initializations
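The counts in Figure 5.1 follow from simple arithmetic on the trip counts. The Python sketch below is a hedged illustration: it assumes each loop is initialized once per entry and its condition is checked once per iteration plus one final failing check; the function name `loop_overhead` is hypothetical.

```python
def loop_overhead(outer_trips, inner_trips):
    """Overhead counts for a doubly nested loop with constant trip counts.

    Assumption: a loop's condition is evaluated once per iteration plus one
    final failing check, and the inner loop is re-initialized on every outer
    iteration.
    """
    return {
        "outer_init": 1,
        "inner_init": outer_trips,
        "outer_checks": outer_trips + 1,
        "inner_checks": outer_trips * (inner_trips + 1),
    }

before = loop_overhead(1000, 2)   # outer loop runs 1000 times, inner loop twice
after = loop_overhead(2, 1000)    # the same nest after interchange
```

Under these assumptions `before` and `after` reproduce the two columns of the figure, and the body S is executed 2000 times in both cases.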

The major advantage of loop interchanging is that it allows other optimizing
manipulations involving loops, most notably loop-conditional joining, but to a lesser
degree loop splitting, to occur. A conditional statement may refer to the loop control
variable not of its immediately encompassing for statement, but rather of an outer
for statement. This occurs in many cases in the image algebra, and will appear in an
example discussed in Chapter 6. Loop interchange is crucial in allowing more
powerful transformations to take place.

Figure 5.2 shows a nested loop before and after loop interchange, with the loops
unrolled to show the exact statement ordering. While it is fairly straightforward
to see the difference in the code before and after loop interchange, it is awkward
to express this difference mathematically. Transition from row-major to
column-major involves moving statements over large groups of statements. Figure 5.3
shows some of this movement. To transform row-major to column-major, the statement
$S[n_1/x_2][m_1 + 1/x_1]$ must be interchanged with all of the statements above it except
$S[n_1/x_2][m_1/x_1]$ (that is, all of the statements with a lower row number and a higher
column number). Then, the statement $S[n_1/x_2][m_1 + 2/x_1]$ must be interchanged
with all of the statements above it except the first two (again, all statements except
those with a lower row number and a lower column number). A statement in row i
and column j must be interchanged with statements in all previous rows which have
a higher column number.

The code segments:

(in row-major form)                    (in column-major form)

for x1 := m1 to m2 do                  for x2 := n1 to n2 do
    for x2 := n1 to n2 do                  for x1 := m1 to m2 do
        S                                      S
    od                                     od
od                                     od

are equivalent to the following statements (assuming m1 ≤ m2 and n1 ≤ n2):

S[n1/x2][m1/x1];                       S[m1/x1][n1/x2];
S[n1 + 1/x2][m1/x1];                   S[m1 + 1/x1][n1/x2];
S[n1 + 2/x2][m1/x1];                   S[m1 + 2/x1][n1/x2];
...                                    ...
S[n2/x2][m1/x1];                       S[m2/x1][n1/x2];
S[n1/x2][m1 + 1/x1];                   S[m1/x1][n1 + 1/x2];
...                                    ...
S[n2/x2][m1 + 1/x1];                   S[m2/x1][n1 + 1/x2];
...                                    ...
S[n2/x2][m2/x1];                       S[m2/x1][n2/x2];

Figure 5.2. The Effects of Loop Interchange

S[n1/x2][m1/x1];                       S[n1/x2][m1/x1];
S[n1 + 1/x2][m1/x1];                   S[n1/x2][m1 + 1/x1];
S[n1 + 2/x2][m1/x1];                   S[n1 + 1/x2][m1/x1];
...                                    S[n1 + 2/x2][m1/x1];
S[n2/x2][m1/x1];                       ...
S[n1/x2][m1 + 1/x1];                   S[n2/x2][m1/x1];
...                                    S[n1 + 1/x2][m1 + 1/x1];
S[n2/x2][m1 + 1/x1];                   ...
...                                    S[n2/x2][m1 + 1/x1];
S[n2/x2][m2/x1];                       S[n1/x2][m1 + 2/x1];
                                       ...
                                       S[n2/x2][m2/x1];

Figure 5.3. Statement Interchanging During Loop Interchanging

As  a preliminary  result  to  loop  interchange,  notice  that  if  there  is  no  body  to  a 
loop,  the  loop  has  no  meaning. 


Lemma 5.2.1 (Loop simplification)

$\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ D\ \mathbf{od} = D$


The proof of this follows directly from loop unrolling.

With this result, the proof of the validity of loop interchange is fairly
straightforward.
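Theorem 5.2.2 (Loop Interchange)

$\models \mathcal{M}(\mathbf{for}\ x_1 := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ \mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ S\ \mathbf{od}\ \mathbf{od})(\sigma) =$

$\mathcal{M}(\mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ \mathbf{for}\ x_1 := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od}\ \mathbf{od})(\sigma)$

Provided:

$\mathit{sets}(S) \cap \mathit{uses}(s_1) = \emptyset$
$\mathit{sets}(S) \cap \mathit{uses}(s_2) = \emptyset$
$\mathit{sets}(S) \cap \mathit{uses}(s_3) = \emptyset$
$\mathit{sets}(S) \cap \mathit{uses}(s_4) = \emptyset$
$\{x_1\} \cap \mathit{ivar}(s_3) = \emptyset$
$\{x_1\} \cap \mathit{ivar}(s_4) = \emptyset$

$\forall y_1, y_2, y_3, y_4 \ni \mathcal{R}(s_1)(\sigma) \le y_1 < y_2 \le \mathcal{R}(s_2)(\sigma)$ and $\mathcal{R}(s_3)(\sigma) \le y_3 < y_4 \le \mathcal{R}(s_4)(\sigma)$,
$S[y_3/x_2][y_2/x_1]$ is interchangeable with $S[y_4/x_2][y_1/x_1]$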

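Proof: By induction on the number of times through the loop,
($\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1$).

Basis: ($\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1$) $\le 0$ (no iterations).

$\mathcal{M}(\mathbf{for}\ x_1 := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ \mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ S\ \mathbf{od}\ \mathbf{od})(\sigma)$

$= \mathcal{M}(D)(\sigma)$ (Loop unrolling [4.3.1])

$= \mathcal{M}(\mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ D\ \mathbf{od})(\sigma)$ (for simplification [5.2.1])

$= \mathcal{M}(\mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ \mathbf{for}\ x_1 := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od}\ \mathbf{od})(\sigma)$

(Loop rolling [4.3.1])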


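Induction step: Assume that
$\mathbf{for}\ x_1 := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ \mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ S\ \mathbf{od}\ \mathbf{od} =$

$\mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ \mathbf{for}\ x_1 := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od}\ \mathbf{od}$
in states where ($\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1$) $= k$ ($k \ge 0$).


Show it is true in states where ($\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1$) $= k + 1$.

$\mathcal{M}(\mathbf{for}\ x_1 := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ \mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ S\ \mathbf{od}\ \mathbf{od})(\sigma)$

$= \mathcal{M}(\mathbf{for}\ x_1 := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ \mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ S\ \mathbf{od}\ \mathbf{od};$
$\quad (\mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ S\ \mathbf{od})[\mathcal{R}(s_2)(\sigma)/x_1])(\sigma)$

(Loop unrolling [4.3.2])

$= \mathcal{M}(\mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ \mathbf{for}\ x_1 := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S\ \mathbf{od}\ \mathbf{od};$
$\quad (\mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ S\ \mathbf{od})[\mathcal{R}(s_2)(\sigma)/x_1])(\sigma)$

(Induction hypothesis)

$= \mathcal{M}(\mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ \mathbf{for}\ x_1 := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S\ \mathbf{od}\ \mathbf{od};$
$\quad \mathbf{for}\ x_2 := s_3[\mathcal{R}(s_2)(\sigma)/x_1]\ \mathbf{to}\ s_4[\mathcal{R}(s_2)(\sigma)/x_1]\ \mathbf{do}$
$\quad S[\mathcal{R}(s_2)(\sigma)/x_1]\ \mathbf{od})(\sigma)$ (Def. of subst. into stat. [3.4.4])

$= \mathcal{M}(\mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ \mathbf{for}\ x_1 := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S\ \mathbf{od}\ \mathbf{od};$
$\quad \mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ S[\mathcal{R}(s_2)(\sigma)/x_1]\ \mathbf{od})(\sigma)$

(Substitution simplification [5.3.2])

$= \mathcal{M}(\mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ \mathbf{for}\ x_1 := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S\ \mathbf{od};$
$\quad S[\mathcal{R}(s_2)(\sigma)/x_1][x_2/x_2]\ \mathbf{od})(\sigma)$ (Loop joining [5.1.1])

$= \mathcal{M}(\mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ \mathbf{for}\ x_1 := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S\ \mathbf{od};$
$\quad S[\mathcal{R}(s_2)(\sigma)/x_1]\ \mathbf{od})(\sigma)$ (Def. of subst. [3.4.4])

$= \mathcal{M}(\mathbf{for}\ x_2 := s_3\ \mathbf{to}\ s_4\ \mathbf{do}\ \mathbf{for}\ x_1 := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od}\ \mathbf{od})(\sigma)$

(Loop rolling [4.3.2])

□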


5.3  Code  Motion 

Code  motion  is  a traditional  code  transformation  which  removes  loop  invariant 
statements  from  a loop.  It  can  be  viewed  as  a series  of  smaller  transformations.  First 
the  statement  to  be  removed  is  interchanged  with  the  other  statements  in  the  loop 
until  it  becomes  the  first  statement.  (If  this  cannot  be  done,  the  statement  cannot  be 
removed.)  Then  the  statements  in  the  loop  are  unrolled.  The  statements  which  are 
invariant  (since  the  loop  may  have  been  unrolled  more  than  once,  there  may  be  more 
than  one  copy  of  the  invariant  statement)  are  moved  to  the  first  positions  by  more 
statement  interchange,  leaving  the  other  statements  in  the  same  order  as  before.  The 
remaining  statements  are  rolled  into  the  loop  again  and  the  invariant  statement  is 
compressed  so  there  is  only  one  copy  of  it.  Since  all  of  these  operations  have  already 
been  shown  to  yield  code  with  the  same  meaning  as  the  original  code,  the  overall 
operation  must  also  result  in  equivalent  code. 
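As a minimal sketch of what the transformation preserves, the following Python fragment runs a loop with an invariant statement and its code-moved form on the same state and compares the results. The concrete loop body, the dictionary model of a state, and the function names are assumptions made for illustration; they are not part of the language of Chapter 3. The loop control variable is treated as local to the loop.

```python
def run_original(lo, hi, state):
    """for x := lo to hi do y := 1; a[x] := y + x od"""
    s = dict(state)                   # copy, so the input state is untouched
    for x in range(lo, hi + 1):
        s["y"] = 1                    # S1: loop-invariant assignment
        s[("a", x)] = s["y"] + x      # S2: depends on the loop control variable
    return s


def run_moved(lo, hi, state):
    """y := 1; for x := lo to hi do a[x] := y + x od"""
    s = dict(state)
    s["y"] = 1                        # S1 moved in front of the loop
    for x in range(lo, hi + 1):
        s[("a", x)] = s["y"] + x
    return s


# With at least one iteration the final states agree.
assert run_original(1, 10, {}) == run_moved(1, 10, {})

# With no iterations they differ: the moved form still assigns y.
assert run_original(5, 4, {}) != run_moved(5, 4, {})
```

The second assertion illustrates why a condition that the loop executes at least once is needed before an invariant statement can be moved out.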
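Theorem 5.3.1 (Code Motion)

$\models \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_1; S_2\ \mathbf{od})(\sigma) = \mathcal{M}(S_1;\ \mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_2\ \mathbf{od})(\sigma)$
Provided:

$\mathit{sets}(S_1) \cap \mathit{uses}(s_1) = \emptyset$;
$\mathit{sets}(S_1) \cap \mathit{uses}(s_2) = \emptyset$;

$\mathcal{R}(s_1)(\sigma) \le \mathcal{R}(s_2)(\sigma)$;
$x \notin \mathit{ivar}(S_1)$;

$S_1$ is interchangeable with $S_2[k/x]$ for $k = m \ldots n - 1$;

$S_1; S_1$ is compressible.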

In  proving  this  theorem,  the  following  lemmas  are  very  useful. 

Lemma 5.3.2

If $x \notin \mathit{ivar}(s)$ then $s[s_1/x] = s$. Similarly, if $x \notin \mathit{ivar}(b)$ then $b[s_1/x] = b$.

Lemma 5.3.3

If $x \notin \mathit{ivar}(S)$ then $S[s_1/x] = S$.

The proofs of both of these lemmas follow by mathematical induction on the complexity of $s$, $b$, and $S$. The proof of code motion then follows by induction on the number of times through the loop, $(\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1)$.

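Proof:

Basis: $(\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1) = 1$.

$\mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_1; S_2\ \mathbf{od})(\sigma)$

$= \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2 - 1\ \mathbf{do}\ S_1; S_2\ \mathbf{od};\ (S_1; S_2)[\mathcal{R}(s_2)(\sigma)/x])(\sigma)$ (Loop unrolling [4.3.2])

$= \mathcal{M}(D;\ (S_1; S_2)[\mathcal{R}(s_2)(\sigma)/x])(\sigma)$ (Loop unrolling [4.3.1])

$= \mathcal{M}((S_1; S_2)[\mathcal{R}(s_2)(\sigma)/x])(\sigma)$ (Def. of D [3.5.3])

$= \mathcal{M}(S_1[\mathcal{R}(s_2)(\sigma)/x];\ S_2[\mathcal{R}(s_2)(\sigma)/x])(\sigma)$ (Def. of subst. into stat. [3.4.4])

$= \mathcal{M}(S_1;\ S_2[\mathcal{R}(s_2)(\sigma)/x])(\sigma)$ (Subst. simplification [5.3.3])

$= \mathcal{M}(S_2[\mathcal{R}(s_2)(\sigma)/x])(\mathcal{M}(S_1)(\sigma))$ (Def. of $S_1; S_2$ [3.5.3])

$= \mathcal{M}(S_2[\mathcal{R}(s_2)(\sigma)/x];\ D)(\mathcal{M}(S_1)(\sigma))$ (Def. of D [3.5.3])

$= \mathcal{M}(S_2[\mathcal{R}(s_2)(\sigma)/x];\ \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S_2\ \mathbf{od})(\mathcal{M}(S_1)(\sigma))$ (Loop rolling [4.3.1])

$= \mathcal{M}(S_2[\mathcal{R}(s_2)(\mathcal{M}(S_1)(\sigma))/x];\ \mathbf{for}\ x := \mathcal{R}(s_1)(\mathcal{M}(S_1)(\sigma)) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\mathcal{M}(S_1)(\sigma))\ \mathbf{do}\ S_2\ \mathbf{od})(\mathcal{M}(S_1)(\sigma))$ (Def. of empty sets and uses [3.8.3])

$= \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_2\ \mathbf{od})(\mathcal{M}(S_1)(\sigma))$ (Loop rolling [4.3.1])

$= \mathcal{M}(S_1;\ \mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_2\ \mathbf{od})(\sigma)$ (Def. of $S_1; S_2$ [3.5.3])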

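Induction step: Assume that

$\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_1; S_2\ \mathbf{od} = S_1;\ \mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_2\ \mathbf{od}$

in states where $(\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1) = k$ $(k \ge 1)$.

Show it is true in states where $(\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1) = k + 1$.

$\mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_1; S_2\ \mathbf{od})(\sigma)$

$= \mathcal{M}((S_1; S_2)[\mathcal{R}(s_1)(\sigma)/x];\ \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S_1; S_2\ \mathbf{od})(\sigma)$ (Loop unrolling [4.3.2])

$= \mathcal{M}(S_1[\mathcal{R}(s_1)(\sigma)/x];\ S_2[\mathcal{R}(s_1)(\sigma)/x];\ \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S_1; S_2\ \mathbf{od})(\sigma)$ (Def. of subst. into stat. [3.4.4])

$= \mathcal{M}(S_1;\ S_2[\mathcal{R}(s_1)(\sigma)/x];\ \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S_1; S_2\ \mathbf{od})(\sigma)$ (Subst. simplification [5.3.3])

$= \mathcal{M}(S_1;\ S_2[\mathcal{R}(s_1)(\sigma)/x];\ S_1;\ \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S_2\ \mathbf{od})(\sigma)$ (Induction hypothesis)

$= \mathcal{M}(S_1;\ S_1;\ S_2[\mathcal{R}(s_1)(\sigma)/x];\ \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S_2\ \mathbf{od})(\sigma)$ (Statement interchange [4.1.1])

$= \mathcal{M}(S_1;\ S_2[\mathcal{R}(s_1)(\sigma)/x];\ \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S_2\ \mathbf{od})(\sigma)$ (Statement compression [4.1.6])

$= \mathcal{M}(S_2[\mathcal{R}(s_1)(\sigma)/x];\ \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S_2\ \mathbf{od})(\mathcal{M}(S_1)(\sigma))$ (Def. of $S_1; S_2$ [3.5.3])

$= \mathcal{M}(S_2[\mathcal{R}(s_1)(\mathcal{M}(S_1)(\sigma))/x];\ \mathbf{for}\ x := \mathcal{R}(s_1)(\mathcal{M}(S_1)(\sigma)) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\mathcal{M}(S_1)(\sigma))\ \mathbf{do}\ S_2\ \mathbf{od})(\mathcal{M}(S_1)(\sigma))$ (Def. of empty sets and uses [3.8.3])

$= \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_2\ \mathbf{od})(\mathcal{M}(S_1)(\sigma))$ (Loop rolling [4.3.2])

$= \mathcal{M}(S_1;\ \mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S_2\ \mathbf{od})(\sigma)$ (Def. of $S_1; S_2$ [3.5.3])

□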

5.4  Loop-Conditional  Joining 

Most of the conventional optimizations, while they may increase the speed of code significantly, do not actually change the order of complexity of the code. If an algorithm is $O(n^2)$ before the optimization, it is probably still $O(n^2)$ after the optimization. (There are of course some cases where an optimization may completely eliminate a block of code, effectively reducing its running time to $O(1)$.) Loop-conditional joining may change the number of times a loop is executed and in some cases may reduce that number to a constant such as one or two.

Loop-conditional joining is applied to a loop whose only statement is an if then statement that compares the loop control variable with an expression whose value depends on neither the loop control variable nor any statement in the loop. If the conditional has an else clause, the if then else statement must first be split (using Theorem 4.2.3), and the loop split into two loops containing one if then statement each (using Theorem 5.1.1). The loop is unrolled (using Theorems 4.3.1 and 4.3.2) and the conditional is simplified (using Theorem 4.2.5). This gives a group of zero or more statements which no longer have the original conditional, which are then rerolled into another loop (or left as is if there are only a few statements).
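Theorem 5.4.1 (Loop-Conditional Joining)

$\models \mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ \mathbf{if}\ x \ge s_3\ \mathbf{then}\ S\ \mathbf{fi}\ \mathbf{od} = \mathbf{for}\ x := \max(s_1, s_3)\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od}$

and

$\models \mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ \mathbf{if}\ x \le s_3\ \mathbf{then}\ S\ \mathbf{fi}\ \mathbf{od} = \mathbf{for}\ x := s_1\ \mathbf{to}\ \min(s_2, s_3)\ \mathbf{do}\ S\ \mathbf{od}$

Provided:

$\mathit{sets}(S) \cap \mathit{uses}(s_3) = \emptyset$
$x \notin \mathit{uses}(s_3)$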
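
The two equalities can be spot-checked on small integer ranges. The sketch below is only an illustration in Python, not part of the proof: loop bounds are plain integers, the loop body is modelled by a callable, and the function names are hypothetical.

```python
def guarded_ge(s1, s2, s3, body):
    """for x := s1 to s2 do if x >= s3 then S fi od"""
    trace = []
    for x in range(s1, s2 + 1):
        if x >= s3:
            trace.append(body(x))
    return trace


def joined_ge(s1, s2, s3, body):
    """for x := max(s1, s3) to s2 do S od"""
    return [body(x) for x in range(max(s1, s3), s2 + 1)]


def guarded_le(s1, s2, s3, body):
    """for x := s1 to s2 do if x <= s3 then S fi od"""
    trace = []
    for x in range(s1, s2 + 1):
        if x <= s3:
            trace.append(body(x))
    return trace


def joined_le(s1, s2, s3, body):
    """for x := s1 to min(s2, s3) do S od"""
    return [body(x) for x in range(s1, min(s2, s3) + 1)]


# Exhaustive comparison over a small cube of bounds, including empty loops.
for s1 in range(-3, 4):
    for s2 in range(-3, 4):
        for s3 in range(-3, 4):
            assert guarded_ge(s1, s2, s3, abs) == joined_ge(s1, s2, s3, abs)
            assert guarded_le(s1, s2, s3, abs) == joined_le(s1, s2, s3, abs)
```

The guarded forms always test the condition once per value in the original range, while the joined forms visit only the values for which the body actually runs.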

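Proof: Proof of the first part by induction on the number of times through the loop, $(\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1)$. The proof of the second part is nearly identical.

Basis: $(\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1) \le 0$ (no iterations).

$\mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ \mathbf{if}\ x \ge s_3\ \mathbf{then}\ S\ \mathbf{fi}\ \mathbf{od})(\sigma)$

$= \mathcal{M}(D)(\sigma)$ (Loop unrolling [4.3.1])

$= \mathcal{M}(\mathbf{for}\ x := \max(s_1, s_3)\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})(\sigma)$ (Loop rolling [4.3.1])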


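Induction step: Assume that

$\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ \mathbf{if}\ x \ge s_3\ \mathbf{then}\ S\ \mathbf{fi}\ \mathbf{od} = \mathbf{for}\ x := \max(s_1, s_3)\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od}$

in states where $(\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1) = k$ $(k \ge 0)$.

Show it is true in states where $(\mathcal{R}(s_2)(\sigma) - \mathcal{R}(s_1)(\sigma) + 1) = k + 1$.

$\mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ \mathbf{if}\ x \ge s_3\ \mathbf{then}\ S\ \mathbf{fi}\ \mathbf{od})(\sigma)$

$= \mathcal{M}((\mathbf{if}\ x \ge s_3\ \mathbf{then}\ S\ \mathbf{fi})[\mathcal{R}(s_1)(\sigma)/x];\ \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ \mathbf{if}\ x \ge s_3\ \mathbf{then}\ S\ \mathbf{fi}\ \mathbf{od})(\sigma)$ (Loop unrolling [4.3.2])

$= \mathcal{M}(\mathbf{if}\ x[\mathcal{R}(s_1)(\sigma)/x] \ge s_3[\mathcal{R}(s_1)(\sigma)/x]\ \mathbf{then}\ S[\mathcal{R}(s_1)(\sigma)/x]\ \mathbf{fi};\ \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ \mathbf{if}\ x \ge s_3\ \mathbf{then}\ S\ \mathbf{fi}\ \mathbf{od})(\sigma)$ (Def. of subst. into stat. [3.4.4])

$= \mathcal{M}(\mathbf{if}\ \mathcal{R}(s_1)(\sigma) \ge s_3\ \mathbf{then}\ S[\mathcal{R}(s_1)(\sigma)/x]\ \mathbf{fi};\ \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ \mathbf{if}\ x \ge s_3\ \mathbf{then}\ S\ \mathbf{fi}\ \mathbf{od})(\sigma)$ (Def. of subst. into exp. [3.4.1])

$= \mathcal{M}(\mathbf{if}\ \mathcal{R}(s_1)(\sigma) \ge s_3\ \mathbf{then}\ S[\mathcal{R}(s_1)(\sigma)/x]\ \mathbf{fi};\ \mathbf{for}\ x := \max(\mathcal{R}(s_1)(\sigma) + 1, s_3)\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\sigma)$ (Induction hypothesis)

$= \mathcal{M}(\mathbf{for}\ x := \max(\mathcal{R}(s_1)(\sigma) + 1, s_3)\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\mathcal{M}(\mathbf{if}\ \mathcal{R}(s_1)(\sigma) \ge s_3\ \mathbf{then}\ S[\mathcal{R}(s_1)(\sigma)/x]\ \mathbf{fi})(\sigma))$ (Def. of $S_1; S_2$ [3.5.3])

$= \mathcal{M}(\mathbf{for}\ x := \max(\mathcal{R}(s_1)(\sigma) + 1, s_3)\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\mathbf{if}\ \mathcal{R}(\mathcal{R}(s_1)(\sigma))(\sigma) \ge \mathcal{R}(s_3)(\sigma)\ \mathbf{then}\ \mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma)\ \mathbf{else}\ \mathcal{M}(D)(\sigma)\ \mathbf{fi})$ (Def. of if [3.5.3])

$= \mathcal{M}(\mathbf{for}\ x := \max(\mathcal{R}(s_1)(\sigma) + 1, s_3)\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\mathbf{if}\ \mathcal{R}(s_1)(\sigma) \ge \mathcal{R}(s_3)(\sigma)\ \mathbf{then}\ \mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma)\ \mathbf{else}\ \mathcal{M}(D)(\sigma)\ \mathbf{fi})$ (Sem. of $\mathcal{R}(m)$ [3.5.1])

$= \mathbf{if}\ \mathcal{R}(s_1)(\sigma) \ge \mathcal{R}(s_3)(\sigma)\ \mathbf{then}\ \mathcal{M}(\mathbf{for}\ x := \max(\mathcal{R}(s_1)(\sigma) + 1, s_3)\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))\ \mathbf{else}\ \mathcal{M}(\mathbf{for}\ x := \max(\mathcal{R}(s_1)(\sigma) + 1, s_3)\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\mathcal{M}(D)(\sigma))\ \mathbf{fi}$ (Meaning of if)

$= \mathbf{if}\ \mathcal{R}(s_1)(\sigma) \ge \mathcal{R}(s_3)(\sigma)\ \mathbf{then}\ \mathcal{M}(\mathbf{for}\ x := \max(\mathcal{R}(s_1)(\sigma) + 1, s_3)\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))\ \mathbf{else}\ \mathcal{M}(\mathbf{for}\ x := \max(\mathcal{R}(s_1)(\sigma) + 1, s_3)\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\sigma)\ \mathbf{fi}$ (Meaning of D [3.5.3])

$= \mathbf{if}\ \mathcal{R}(s_1)(\sigma) \ge \mathcal{R}(s_3)(\sigma)\ \mathbf{then}\ \mathcal{M}(\mathbf{for}\ x := \max(\mathcal{R}(s_1)(\sigma) + 1, s_3)\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))\ \mathbf{else}\ \mathcal{M}(\mathbf{for}\ x := \max(s_1 + 1, s_3)\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})(\sigma)\ \mathbf{fi}$ (Sem. of $\mathcal{R}(s)$ [3.5.1])

$= \mathbf{if}\ \mathcal{R}(s_1)(\sigma) \ge \mathcal{R}(s_3)(\sigma)\ \mathbf{then}\ \mathcal{M}(\mathbf{for}\ x := \max(\mathcal{R}(s_1)(\sigma) + 1, s_3)\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x])(\sigma))\ \mathbf{else}\ \mathcal{M}(\mathbf{for}\ x := \max(s_1, s_3)\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})(\sigma)\ \mathbf{fi}$ (Since $\mathcal{R}(s_1)(\sigma) < \mathcal{R}(s_3)(\sigma)$, $\max(s_1, s_3) = \max(s_1 + 1, s_3)$)

$= \mathbf{if}\ \mathcal{R}(s_1)(\sigma) \ge \mathcal{R}(s_3)(\sigma)\ \mathbf{then}\ \mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x];\ \mathbf{for}\ x := \max(\mathcal{R}(s_1)(\sigma) + 1, s_3)\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\sigma)\ \mathbf{else}\ \mathcal{M}(\mathbf{for}\ x := \max(s_1, s_3)\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})(\sigma)\ \mathbf{fi}$ (Def. of $S_1; S_2$ [3.5.3])

$= \mathbf{if}\ \mathcal{R}(s_1)(\sigma) \ge \mathcal{R}(s_3)(\sigma)\ \mathbf{then}\ \mathcal{M}(S[\mathcal{R}(s_1)(\sigma)/x];\ \mathbf{for}\ x := \mathcal{R}(s_1)(\sigma) + 1\ \mathbf{to}\ \mathcal{R}(s_2)(\sigma)\ \mathbf{do}\ S\ \mathbf{od})(\sigma)\ \mathbf{else}\ \mathcal{M}(\mathbf{for}\ x := \max(s_1, s_3)\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})(\sigma)\ \mathbf{fi}$ ($\mathcal{R}(s_1)(\sigma) \ge \mathcal{R}(s_3)(\sigma)$)

$= \mathbf{if}\ \mathcal{R}(s_1)(\sigma) \ge \mathcal{R}(s_3)(\sigma)\ \mathbf{then}\ \mathcal{M}(\mathbf{for}\ x := s_1\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})(\sigma)\ \mathbf{else}\ \mathcal{M}(\mathbf{for}\ x := \max(s_1, s_3)\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})(\sigma)\ \mathbf{fi}$ (Loop rolling [4.3.2])

$= \mathbf{if}\ \mathcal{R}(s_1)(\sigma) \ge \mathcal{R}(s_3)(\sigma)\ \mathbf{then}\ \mathcal{M}(\mathbf{for}\ x := \max(s_1, s_3)\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})(\sigma)\ \mathbf{else}\ \mathcal{M}(\mathbf{for}\ x := \max(s_1, s_3)\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})(\sigma)\ \mathbf{fi}$ ($\mathcal{R}(s_1)(\sigma) \ge \mathcal{R}(s_3)(\sigma)$)

$= \mathcal{M}(\mathbf{for}\ x := \max(s_1, s_3)\ \mathbf{to}\ s_2\ \mathbf{do}\ S\ \mathbf{od})(\sigma)$ (Meaning of if)

□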


CHAPTER  6 

A PROTOTYPE  OPTIMIZER 


The  previous  chapters  provide  the  necessary  denotational  semantic  background 
for  performing  code  optimizations.  This  background  certainly  provides  the  provable 
correctness  of  the  transformations  and  related  optimizations.  Still,  it  remains  to  be 
shown  that  these  transformations  can  be  implemented  in  a reasonable  fashion. 

This research has revealed that they can indeed be implemented to give the basic tools for a person to use to perform a variety of potential optimizations and evaluate the results. This chapter describes the Heuristic Optimizing Prototype System (HOPS), a LISP system that provides all of the primitive and global optimizations discussed in Chapters 4 and 5. The description begins with an overview of the system in Section 6.1. In developing the system, the need became apparent for a timer for the code and for a better approximation of when two sets of variables are always separate. The timer which resulted is discussed in Section 6.2 and the refinement of the static approximation for always separate is presented in Section 6.3. Once the basic transformations were implemented, it was necessary to write programs to combine them in an attempt to optimize code. A discussion of these heuristic programs is in Section 6.4. Finally, a large example of HOPS in action, optimizing a program to do image histograms, is presented in Section 6.5.

6.1  The  System  and  Its  Data  Structures 

HOPS is an interactive LISP system, implemented in XLISP running under UNIX, on a Sun 3/280 and a Gould Powernode 9080. It has functions that perform each of the primitive transformations as described in Chapter 4 and each of the optimizing transformations as described in Chapter 5. It also provides functions to aid a human optimizer, including statement and expression construction, extraction of parts of statements and expressions, statement and expression validity checks, peephole optimizations, and, probably most importantly, statement timing. The timer is discussed in more detail in Section 6.2 and the other functions are described later in this section.

Simple variables and constants in this language are LISP atoms, while indexed variables, expressions and statements are lists. Variables must be declared in HOPS programs, unlike programs in the language described in Chapter 3. This can be done with the function MakeVariable. Statements and expressions are stored in prefix form. The format for storing each of these is given in Table 6.1.

In order to simplify some testing, abstract statements and expressions are allowed. Thus, if the user is interested in examining if statement transformations, the user need not enter complete statements for each clause, but may instead enter abstract statements for these unimportant clauses. For example, a user interested in loop-conditional joining may not want to consider the statement in the then clause of the conditional being joined. In this case, an abstract statement could be used. Similarly, someone trying to explore if statement simplification as described in Theorem 4.2.5 could use an abstract boolean expression rather than some real expression. Abstract values are assumed to set and use no variables. These abstract values are atoms and must be declared by the user using the functions MakeAbstractStatement, MakeAbstractBoolean, and MakeAbstractExpression.

The  functions  in  HOPS  each  take  a single  statement  as  a parameter  and  return 
the  statement  after  execution  of  the  transformation,  if  the  transformation  is  valid. 
Table 6.1. HOPS Equivalents for Language Constructs

Language Construct              HOPS Equivalent

Integer Variables
  x                             x
  a[s]                          (a s)

Integer Expressions
  m                             m
  s1 op s2                      (op s1 s2)       legal ops are +, -, *, /
  if b then s1 else s2 fi       (if b s1 s2)

Boolean Expressions
  true                          true
  false                         false
  s1 op s2                      (op s1 s2)       legal ops are <, >, <=, >=, =, <>
  not b                         (not b)
  s in s1...s2                  (in s (s1 s2))

Statements
  v := s                        (:= v s)
  S1; S2                        (S1 S2)
  if b then S1 else S2 fi       (if b S1 S2)
  D                             0
  for x := s1 to s2 do S od     (for x s1 s2 S)

There are a number of functions allowing the user to extract single statements from structured statements, making it easier to send the single statement expected here as a parameter. These are relatively straightforward and are omitted from the present discussion. The important transforming functions include the following.

Interchange. This function will interchange the first two statements in a compound statement.

InterchangeWithSubstitution. This function interchanges the first two statements of a compound list. The first statement must be an assignment statement and the value assigned to the left-hand-side is substituted for the left-hand-side in the second statement.

InterchangeWithBackSubstitution. This function performs interchange with backward substitution as described in Theorem 4.1.5.

LoopUnroll. This function unrolls a statement from the back of a for statement. Currently, this is only done if the loop bounds are constants; there is no attempt made to evaluate variable loop bounds.

AbsorbIn. This function will attempt to absorb statements into an if statement from either before or after the if statement. If only absorption of statements following the if statement is desired, AbsorbIn1 can be used, while AbsorbIn2 only attempts to absorb statements from before the if statement.

ExtractOut. This function removes statements from both branches of an if statement to either before or after the statement. ExtractOut1 and ExtractOut2 will move statements to after or before the if statement respectively.

AbsorbWithSubstitution. This function absorbs assignment statements before if statements into the if statement, substituting the right-hand-side for the left-hand-side in the if statement condition as necessary.

IfSplit. This function divides an if then else statement into two if then statements.

LoopInterchange. This function will interchange the boundaries of nested loops.

LoopJoin. This function will join a pair of for loops with the same range.

LoopSplit. This function will split a single for loop into a pair of for loops with the same range.

MoveCode. This function will remove loop invariant statements to before for loops.

LCJoin. This function will convert a loop statement with a nested conditional statement into a loop statement with altered bounds.

StatementSimplify. This function performs a variety of peephole optimizations.

Each of the functions of HOPS is nondestructive. This enables a user to assign the value of a statement to a variable and then to try a variety of transformations on the variable, being sure of always starting with the same statement. A typical call sequence might then be:

(setq testprog '(for x 1 10 ((:= y 1) (:= x (+ y x)))))
                                Give the test program a value.
(time testprog)                 Compute the original time it takes.
(setq try1 (MoveCode testprog))
                                Attempt code motion.
(time try1)                     Compute the time the transformed program takes.
(setq try2 (LCJoin testprog))
                                Attempt loop-conditional joining.
(time try2)                     Compute the time the transformed program takes.
Notice that the user tried to perform loop-conditional joining when it was not possible (since there is no conditional). HOPS recognizes this and will not alter the statement, so the time for the original statement and the statement after the loop-conditional joining will be the same.
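
The "return the statement unchanged when the transformation does not apply" behaviour can be sketched as follows. This is a hypothetical Python analogue, not the HOPS implementation: statements are nested tuples mirroring the prefix forms of Table 6.1, an if then statement without an else clause is assumed to be written ("if", b, S), the comparison is restricted to an integer constant so that the provisos of loop-conditional joining hold trivially, and the "max" operator in the new lower bound is an assumed extension of the expression forms.

```python
def lc_join(stat):
    """Return a joined loop if stat has the shape
    ("for", x, lo, hi, ("if", (">=", x, c), S)) with c an integer constant;
    otherwise return stat itself, unaltered."""
    if isinstance(stat, tuple) and len(stat) == 5 and stat[0] == "for":
        _, lcv, lo, hi, body = stat
        if isinstance(body, tuple) and len(body) == 3 and body[0] == "if":
            _, cond, then_stat = body
            if (isinstance(cond, tuple) and len(cond) == 3
                    and cond[0] == ">=" and cond[1] == lcv
                    and isinstance(cond[2], int)):
                # Build a new statement; the argument is never modified.
                return ("for", lcv, ("max", lo, cond[2]), hi, then_stat)
    return stat


# The test program from the call sequence has no conditional: no change.
testprog = ("for", "x", 1, 10, ((":=", "y", 1), (":=", "x", ("+", "y", "x"))))
assert lc_join(testprog) is testprog

# A loop guarded by x >= 4 is rewritten; "S" stands for an abstract statement.
guarded = ("for", "x", 1, 10, ("if", (">=", "x", 4), "S"))
assert lc_join(guarded) == ("for", "x", ("max", 1, 4), 10, "S")
```

Because tuples are immutable, every call starts from the same statement, which is the property the nondestructive HOPS functions provide.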

6.2  A Parameterized  Timer 

The  functions  given  in  the  previous  section  provide  the  user  with  the  ability  to 
attempt  a variety  of  transformations,  but  there  is  nothing  in  HOPS  which  will  tell 
the  user  when  one  transformation  provides  improved  code.  Instead,  HOPS  provides  a 
pair  of  timers,  one  which  returns  a symbolic  time  and  the  other  which  simply  returns 
a count  of  time  units,  which  the  user  can  then  use  to  determine  the  benefit  (or  harm) 
of  the  transformation. 

Since  many  systems  differ  in  the  amount  of  time  it  may  take  to  perform  operations, 
these  timers  are  based  on  a set  of  constants,  any  of  which  may  be  changed  by  the 
user  to  better  represent  relative  speeds  of  the  user’s  actual  system.  An  optimizing 
system  based  on  HOPS  could  then  use  the  symbolic  time  to  determine  whether  or 
not  to  apply  a particular  transformation.  Additionally,  there  are  two  functions  to 
return  the  time  of  boolean  and  integer  operators,  so  that  different  operators  may  be 
assigned  different  times  (as  is  so  often  the  case  in  actual  systems).  When  the  bounds 
of  a loop  can  be  statically  evaluated,  the  actual  number  of  iterations  in  the  loop  will 
be  computed,  but  if  the  loop  bounds  are  expressions,  another  default  will  give  the 
number of times through the loop. Figure 6.1 shows the symbolic times of both the
original and altered statements in the example at the end of the previous section.
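
A count-of-time-units timer of this kind can be sketched in a few lines. The following Python fragment is only an illustration under stated assumptions: the parameter names are taken from Figure 6.1, but their numeric values, the DefaultIterations parameter name, and the tuple encoding of statements are invented for the example, and only simple variables, integer constants and binary operators are handled.

```python
# Assumed relative costs; a user would adjust these to match a real machine.
DEFAULT_COSTS = {
    "ForStatementTime": 2,
    "CompoundStatementTime": 0,
    "AssignmentStatementTime": 1,
    "ConstantTime": 0,
    "VariableTime": 1,
    "BinaryFunctionTime": 1,
    "DefaultIterations": 10,   # used when loop bounds are not constants
}


def expr_time(expr, costs):
    """Time of an integer expression: constant, simple variable, or (op e1 e2)."""
    if isinstance(expr, int):
        return costs["ConstantTime"]
    if isinstance(expr, str):
        return costs["VariableTime"]
    _op, left, right = expr
    return (costs["BinaryFunctionTime"]
            + expr_time(left, costs) + expr_time(right, costs))


def stat_time(stat, costs=DEFAULT_COSTS):
    """Time of an assignment, a for statement, or a compound statement."""
    if stat[0] == ":=":
        return costs["AssignmentStatementTime"] + expr_time(stat[2], costs)
    if stat[0] == "for":
        _, _lcv, lo, hi, body = stat
        if isinstance(lo, int) and isinstance(hi, int):
            iterations = max(0, hi - lo + 1)      # statically known bounds
        else:
            iterations = costs["DefaultIterations"]
        return costs["ForStatementTime"] + iterations * stat_time(body, costs)
    # Otherwise: a compound statement, i.e. a tuple of statements.
    return costs["CompoundStatementTime"] + sum(stat_time(s, costs) for s in stat)


original = ("for", "x", 1, 10, ((":=", "y", 1), (":=", "x", ("+", "y", "x"))))
moved = ((":=", "y", 1), ("for", "x", 1, 10, ((":=", "x", ("+", "y", "x")),)))

# Under the assumed costs, moving the invariant assignment lowers the count.
assert stat_time(moved) < stat_time(original)
```

Keeping the costs in a single table means the same two statements can be compared again under a different machine model simply by passing a different dictionary.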

Original statement: (for x 1 10 ((:= y 1) (:= x (+ y x))))
Its symbolic time:

(+ ForStatementTime
   (* (+ 1 (- 10 1))
      (+ CompoundStatementTime
         (+ AssignmentStatementTime
            ConstantTime
         )
         (+ AssignmentStatementTime
            (+ BinaryFunctionTime
               VariableTime
               VariableTime
            )
         )
      )
   )
)

Statement transformed by MoveCode:

((:= y 1) (for x 1 10 ((:= x (+ y x)))))
Its symbolic time:

(+ CompoundStatementTime
   (+ AssignmentStatementTime
      ConstantTime
   )
   (+ ForStatementTime
      (* (+ 1 (- 10 1))
         (+ BinaryFunctionTime
            VariableTime
            VariableTime
         )
      )
   )
)

Figure 6.1. Some Symbolic Statement Times

6.3 A New Approximation of Always Separate

Using ivar and livar as the static approximation of sets and uses in Section 3.9 proved inadequate in HOPS. While the approximation was fine for simple variables, it was far too broad to say that the entire array was set when in fact only one element of it may have been set. Too many image processing functions work on arrays, usually going through them in some specific order each time. Because of this, a new representation of intermediate variables was used in HOPS.

Simple  variables  are  still  stored  as  simple  variables  in  the  intermediate  form. 
Array  variables  are  now  stored  as  both  the  array  and  some  index  information.  This 
array  index  information  is  stored  in  one  of  five  formats,  depending  on  how  the  array 
reference  appears  in  the  program.  These  index  types  are  described  below. 

Constants.  If  the  array  is  indexed  by  a constant  or  constant  expression  (and  thus 
the  reference  is  of  the  form  a[m]),  the  index  will  be  stored  as  the  constant  m. 

Variables.  If  the  array  is  indexed  by  a variable  (and  thus  the  reference  is  of  the 
form  a[x]),  the  index  will  be  stored  as  the  variable  x. 

Expressions. If the array is indexed by a variable expression (and thus the reference is of the form a[s]), the index will be stored as the expression s (where expressions are stored as discussed in Section 6.1).

Constant  subranges.  If  the  array  reference  is  in  a for  loop  with  constant  bounds 
and  is  indexed  by  the  loop  control  variable,  the  index  will  be  stored  as  a subrange  of 
the  constants  in  the  form  (lower-bound  upper-bound). 

Variable  subranges.  If  the  array  reference  is  in  a for  loop  with  nonconstant 
bounds  or  is  indexed  by  a function  of  the  loop  control  variable,  the  index  will 
be  stored  as  a subrange  of  the  expressions  in  the  form  (lower-bound-expression 
upper-bound-expression). 

As before, any two simple variables are always separate if they have different names and any simple variable is always separate from any array variable. When two intermediate variables are array references referring to the same array, it is necessary to check the index information. In some cases, it may be possible to determine from the indices that the array references are always separate. Table 6.2 shows in which cases, based on the index values, array references may be always separate.

Table 6.2. When Two Index Types May Be Always Separate

                                      Type of first index
Type of                                                    Constant    Variable
second index        Constant    Variable    Expression     Subrange    Subrange

Constant            Maybe       Never       Never          Maybe       Never
Variable            Never       Never       Maybe          Never       Never
Expression          Never       Maybe       Maybe          Never       Never
Constant Subrange   Maybe       Never       Never          Maybe       Never
Variable Subrange   Never       Never       Never          Never       Maybe

In the cases in Table 6.2 marked Maybe, HOPS will check the values of the two
indices  to  determine  if  they  can  positively  be  declared  to  be  always  separate.  If,  for 
example,  the  first  index  were  a constant  and  the  second  were  a constant  subrange, 
HOPS  would  only  have  to  check  if  the  constant  fell  into  the  subrange.  If  it  did  not, 
the  two  array  references  could  safely  be  declared  always  separate.  The  spaces  marked 
Never  do  not  mean  that  there  is  no  way  for  the  references  to  be  always  separate,  only 
that  HOPS  cannot  determine  statically  if  they  were.  Certainly,  two  arrays  referenced 
by  variables  may  be  references  to  different  locations,  but  with  no  information  about 
the  values  of  the  indexing  expressions,  HOPS  cannot  conclude  this. 
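
The check for the constant and constant-subrange cells of Table 6.2 can be sketched as an interval test. The Python fragment below is an assumption-laden illustration rather than the HOPS code: a constant index is an int, a constant subrange is a (lower, upper) pair of ints as described above, and every other index form is conservatively treated as undecidable, so the Maybe cells involving variables and expressions are not refined here.

```python
def as_constant_range(index):
    """Map a constant or constant subrange to (low, high); None for anything else."""
    if isinstance(index, int):
        return (index, index)
    if (isinstance(index, tuple) and len(index) == 2
            and all(isinstance(bound, int) for bound in index)):
        return index
    return None


def always_separate(index1, index2):
    """True only when two references to the same array can be shown,
    statically, never to touch the same element."""
    r1 = as_constant_range(index1)
    r2 = as_constant_range(index2)
    if r1 is None or r2 is None:
        return False            # cannot be determined statically
    # Two closed integer ranges are disjoint iff one ends before the other starts.
    return r1[1] < r2[0] or r2[1] < r1[0]


assert always_separate(3, (5, 10))          # constant outside the subrange
assert not always_separate(7, (5, 10))      # constant inside the subrange
assert not always_separate("x", 3)          # variable index: not decidable here
```

Returning False for the undecidable cases matches the reading of Never given above: the references may well be separate, but that cannot be concluded without information about the index values.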

6.4  Some  Heuristic  Programs  Using  the  System 

A user  could  certainly  work  with  these  transformations  to  transform  code  in 
their  raw  form,  but  they  are  at  the  level  of  assembly  language  programming.  In 
order  to  assist  the  user,  some  functions  have  been  added  to  attempt  larger  scale 
transformations.  Thus  HOPS  contains  some  heuristics  for  code  optimizations.  These 
functions  are  described  in  this  section. 
Probably the simplest of these heuristics is DoAllProp, which will perform forward copy propagation wherever possible. Forward copy propagation is done by interchanging with substitution all statements of the form x := v or x := m, to move them as far backward in the code as possible. It may also involve absorption into if statements. The function PropagateAndSimplify combines this with StatementSimplify to perform both peephole simplifications and forward copy statement propagation concurrently. Backward copy propagation, possible as a result of Interchange with Backward Substitution (Lemma 4.1.5), can be done with the function BackPropagate. Only assignment statements with an array reference on the left-hand-side will be propagated backwards.

Statement compression helps RemoveDeadVars with the removal of dead statements, those whose results are no longer needed by the rest of the program. Statement compression alone does not provide dead variable elimination. In order to determine which variables are live at any point in a program, there must be some notion of the output variables of that program. All previous results have considered states to be equal if they agree in the values assigned to all variables, not just a group of output variables. It is conceivable to discuss equality of states restricted to a set of variables, but this is outside the scope of this research.

There  are  also  functions  to  optimize  the  different  structures,  such  as 
CompoundOptimize  and  ForOptimize.  It  is  here  that  various  ways  of  combining 
the  primitive  and  global  transformations  can  be  explored.  I conjecture  that  the 
problem  of  determining  exactly  which  set  of  transformations  will  best  improve  the 
running  time  of  any  piece  of  code  for  any  set  of  timer  parameters  is  most  certainly 
undecidable.  However,  work  has  been  done  to  devise  some  general  rules  for  applying 
the  traditional  global  data  flow  transformations  [6].  This  work  has  been  adapted  for 
use  by  HOPS. 


(def ForOptimize
  (lambda (Stat LiveVars)
    (prog (InnerStat NewStat SecondStat)

      ; Simplify the entire statement and eliminate it if possible.
      (setq Stat
            (StatementSimplify Stat))
      (cond ((not (IsFor Stat))
             (return Stat))
      )

      ; Then, do Backward propagation (since CompoundOptimize won't
      ; know the compound is nested in an IF)
      (setq InnerStat (GetForStat Stat))
      (setq InnerStat
            (BackPropagate InnerStat (GetLCV Stat)))

      ; Now optimize the inner statements
      (setq InnerStat
            (CompoundOptimize InnerStat LiveVars))

      ; Restructure the statement
      (setq NewStat
            (MakeFor
              (GetLCV Stat)
              (GetLowBound Stat)
              (GetHighBound Stat)
              InnerStat
            ))

      ; Attempt code motion from the loop
      (setq NewStat
            (ForceMoveCode NewStat))

      ; Attempt loop-conditional joining
      (setq NewStat
            (ForceLCJoin NewStat))

      (return (StatementSimplify NewStat))
    )
  )
)

Figure 6.2. A HOPS Program to Optimize the Histogram Program
A version of ForOptimize is in Figure 6.2. It begins by simplifying the entire statement with peephole optimizations. If this results in a statement which is not a loop (either because the statement in the for loop is nullable, or because the loop boundaries are computable and the lower bound is greater than or equal to the upper bound), the optimization of the for loop stops there. This initial step is time consuming and may not be needed for some for statements, but was determined to be necessary in the simplification of the histogram program in the next section. Next, before doing global transformations to the nested statement, backward copy propagation is done. Only statements with array references indexed by the loop control variable are propagated backwards. Then, CompoundOptimize is employed to perform global transformations, such as elimination of dead variables and statement compression, on the nested statement. Once the nested statement is improved, code motion and loop-conditional joining are attempted. Finally, peephole simplifications are repeated. This entire plan of attack may constitute overkill for some for statements, but usually does provide improved code whenever improvements are possible. It is still up to the user to look at the times of the code with and without the improvements to determine if the improvements were actually beneficial.

A variety of functions is also provided, along with the simplifying procedures
available in HOPS, to extract from a compound statement those statements that are
of interest and offer greater potential for optimization. The first of these is
ExtractForStat, which finds the first for statement in a compound statement. For
statements are of special interest because the amount of looping in image processing
is so great. SplitCompoundAroundIf extracts an if statement which uses a particular
variable (passed as a parameter) in its condition. This is particularly useful in
loop-conditional joining, and there is a function, ForceLCJoin, which uses it to attempt
a variety of changes in the code that ultimately lead to loop-conditional joining.
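The two extraction functions can be illustrated with a short Python sketch. It is an approximation under stated assumptions, not the HOPS implementation: a compound statement is taken to be a Python list of statements, each statement a nested list whose first element is its keyword, only the top level of the compound statement is searched, and the (before, statement, after) return shape is a choice made here for illustration.

```python
def extract_for_stat(compound):
    """Return (before, for_stmt, after) for the first for statement, or None."""
    for k, stmt in enumerate(compound):
        if stmt and stmt[0] == "for":
            return compound[:k], stmt, compound[k + 1:]
    return None


def mentions(expr, var):
    """True when the variable name occurs anywhere in the expression."""
    if isinstance(expr, list):
        return any(mentions(e, var) for e in expr)
    return expr == var


def split_compound_around_if(compound, var):
    """Return (before, if_stmt, after) for the first if whose condition uses var."""
    for k, stmt in enumerate(compound):
        if stmt and stmt[0] == "if" and mentions(stmt[1], var):
            return compound[:k], stmt, compound[k + 1:]
    return None
```

Returning the statements on either side of the match is convenient for a caller that wants to rebuild the compound statement after transforming the extracted piece.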
(for i LowGray HighGray
  (
    (:= s 0)
    (for j LowPixel HighPixel
      (if (= i (a j))
        (:= s (+ s 1))
        (:= s (+ s 0))
      )
    )
    (:= (h i) s)
  )
)


Figure  6.3.  A Straightforward  Implementation  of  the  Histogram 

While  none  of  these  heuristics  is  altogether  complicated  (and  they  all  leave  it 
to  the  user  to  determine  if  indeed  the  transformation  is  beneficial),  they  do  as  a 
group  show  how  extensible  the  basic  functions  of  HOPS  are  and  indicate  some  of  its 
potential  to  become  a truly  intelligent  code  optimizer. 

6.5  A Large  Example:  The  Histogram 

As  an  example  of  the  abilities  of  HOPS,  even  in  its  present  form,  consider  a 
program  to  compute  the  histogram  of  an  image.  Determining  the  histogram  of  gray 
levels  in  an  image  is  important  for  a variety  of  techniques  such  as  enhancement  by 
histogram  equalization  over  a wider  range  of  gray  levels  and  image  segmentation  [13]. 
To  determine  the  histogram  h of  an  image  a over  the  gray  levels  min-gray-level  to 
max-gray-level  in  the  image  algebra,  the  following  expression  is  used: 

for i in min-gray-level to max-gray-level do
    $h_i \leftarrow \Sigma(\chi_i(a))$
end for
(for i 0 255
  (:= (h i) 0)
)
(for j 0 4095
  (for i (max 0 (a j)) (min 255 (a j))
    (:= (h i)
      (+ (h i) 1)
    )
  )
)


Figure  6.4.  The  Resulting  Histogram  Program 

A straightforward implementation of this algorithm is shown in Figure 6.3. As
stated, the algorithm is extremely inefficient. It requires extra space to hold each
of the new characteristic function images. It also requires looping over the original
image 2n times (where n is the number of gray levels): once per gray level to compute
the characteristic function and once to sum it. While standard optimizations may
improve this code, they cannot reduce the amount of looping.
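The cost of the image algebra formulation can be made concrete with a small Python sketch. It reads the expression literally, building a temporary characteristic image for each gray level and then summing it, and counts how many pixels are visited. The flat-list image representation and the name histogram_by_characteristic are assumptions made for illustration only.

```python
def histogram_by_characteristic(a, low, high):
    """Literal reading of h_i <- sum(chi_i(a)); returns (histogram, pixel visits)."""
    h = {}
    visits = 0
    for i in range(low, high + 1):
        # Pass 1: build the temporary characteristic image for gray level i.
        chi = [1 if pixel == i else 0 for pixel in a]
        visits += len(a)
        # Pass 2: sum the temporary image.
        h[i] = sum(chi)
        visits += len(chi)
    return h, visits
```

For an image of four pixels and four gray levels this visits 2 * 4 * 4 = 32 pixels, which matches the 2n passes over the image described above.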

The  HOPS  program  ForOptimize,  discussed  in  the  previous  section  and  presented 
in  Figure  6.2,  can  be  used  to  optimize  this  histogram  program.  The  resulting  code  is 
in  Figure  6.4.  (Since  LowGray,  HighGray,  LowPixel,  and  HighPixel  are  all  constants, 
they  were  replaced  by  their  corresponding  values  during  the  simplification.) 
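To see that the two programs compute the same result, Figures 6.3 and 6.4 can be transliterated into Python and compared. This is a sketch under assumptions: the image is a flat list of gray values, for loops are taken to include both bounds, and the function names are hypothetical.

```python
def histogram_naive(a, low_gray, high_gray):
    """Transliteration of Figure 6.3: one pass over the image per gray level."""
    h = {}
    for i in range(low_gray, high_gray + 1):
        s = 0
        for pixel in a:
            if i == pixel:
                s = s + 1
            else:
                s = s + 0
        h[i] = s
    return h


def histogram_joined(a, low_gray, high_gray):
    """Transliteration of Figure 6.4: a single pass over the image."""
    h = {i: 0 for i in range(low_gray, high_gray + 1)}
    for pixel in a:
        # The inner loop runs exactly once when the pixel value lies within
        # the gray range and not at all otherwise.
        for i in range(max(low_gray, pixel), min(high_gray, pixel) + 1):
            h[i] = h[i] + 1
    return h
```

The max and min bounds in the inner loop act as a range check: a pixel value outside the gray range yields an empty loop, so no histogram entry is touched.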

HOPS  has  demonstrated  the  potential  to  be  an  important  assistant  to  human  code 
optimizers.  It  still  is  fairly  primitive  and  relies  strongly  on  the  user,  both  to  direct 
its  search  and  to  determine  when  a transformation  is  beneficial.  It  has  not  yet  been 
incorporated  into  any  of  the  image  algebra  programming  languages,  and  this  would 
seem  to  be  a logical  next  step.  Still,  its  success  at  this  level  indicates  the  potential 
of  approaching  global  optimization  as  a collection  of  primitive  transformations. 


CHAPTER  7 
CONCLUSIONS 


This dissertation presents a new approach to global optimizations. Rather than
collecting a variety of information about each statement and performing large
transformations when the conditions are correct, I collect a small amount of information
(only the sets and uses for the statement) and perform small transformations, which
can then be combined to form large optimizations. By performing optimization in
smaller pieces, it is easier to show that each of the pieces is correct. I have described
a small language and have given its denotational definition. With this definition, I
have proven that each of the primitive transformations preserves the meaning of the
statement.
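As an illustration of how little information this is, the sets and uses of a statement can be gathered in one recursive walk. The following Python sketch makes several assumptions not taken from the dissertation: statements are nested lists mirroring the parenthesized notation, the OPERATORS table is a guess at the operator names, an array reference such as (a j) counts as a use of both a and j, and uses are over-approximated by a plain union.

```python
OPERATORS = {"+", "-", "*", "/", "=", "<", ">", "max", "min"}  # assumed operator set


def expr_uses(expr):
    """Variables read by an expression; constants contribute nothing."""
    if isinstance(expr, str):
        return {expr}
    if isinstance(expr, list):
        head, args = expr[0], expr[1:]
        used = set() if head in OPERATORS else {head}  # head is an array name
        for arg in args:
            used |= expr_uses(arg)
        return used
    return set()


def sets_and_uses(stmt):
    """Return (sets, uses) for a statement in nested-list form."""
    if not stmt:
        return set(), set()
    kind = stmt[0]
    if kind == ":=":
        target, value = stmt[1], stmt[2]
        uses = expr_uses(value)
        if isinstance(target, list):       # array element such as (h i)
            for index in target[1:]:
                uses |= expr_uses(index)
            return {target[0]}, uses
        return {target}, uses
    if kind == "for":
        var, low, high, body = stmt[1:5]
        sets, uses = sets_and_uses(body)
        return sets | {var}, uses | expr_uses(low) | expr_uses(high)
    if kind == "if":
        sets, uses = set(), expr_uses(stmt[1])
        for branch in stmt[2:]:
            s, u = sets_and_uses(branch)
            sets |= s
            uses |= u
        return sets, uses
    # Otherwise the statement is a compound: a list of statements.
    sets, uses = set(), set()
    for sub in stmt:
        s, u = sets_and_uses(sub)
        sets |= s
        uses |= u
    return sets, uses
```

Applied to the program of Figure 6.3, this yields the sets i, j, s, and h, and the uses i, j, a, s, and the four bound constants.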

These  primitive  transformations  can  be  combined  to  give  global  optimizations. 
These  optimizations  include  some  of  the  traditional  global  data  flow  optimizations 
such  as  code  motion  and  copy  propagation,  along  with  some  previously  unexploited 
optimizations  such  as  loop-conditional  joining  and  backward  copy  propagation. 

The  Heuristic  Optimizing  Prototype  System  implements  all  of  these  primitive 
transformations  and  global  optimizations.  It  allows  a user  to  experiment  with  a 
variety  of  optimizing  strategies.  It  also  has  functions  developed  to  assist  the  user. 
These  functions  will  attempt  to  rearrange  the  program  so  that  some  of  the  more 
beneficial  optimizations  (such  as  code  motion  and  loop-conditional  joining)  can  occur. 
The  HOPS  timer  is  configurable,  enabling  the  user  to  adjust  timer  parameters  to  best 
represent  the  system  being  optimized. 
There are three areas in which this dissertation would most logically be extended.
The first is in the design of the language for the proofs. As it is now, the language
is not Turing equivalent. While Turing equivalence would be desirable, it is not
necessary for image algebra programs, and it would have greatly complicated the
proofs presented here. The language could be given the power of a Turing machine by
adding either loops which are not bounded at the time of entrance to the loop or
subprograms. If unbounded loops were introduced, the possibility of errors could
no longer be pushed aside, because infinite loops would be a real possibility.
Subprograms would need some sort of parameter passing mechanism to be defined. Side
effects of subprograms may increase the amount of aliasing in the language as well.
These extensions, although powerful, would extend this project beyond what is
necessary for image processing and would greatly increase the complexity of the
proofs given.

The second area where this project could be extended is in the heuristics provided
with the HOPS system for applying the transformations. Determining
when transformations that seem to degrade a piece of code might actually leave it
in a position to be improved greatly is a fascinating problem, albeit one outside the
scope of this dissertation. This appears on the surface to be a classical application
for expert systems. It would be interesting to improve the HOPS system itself so
that it could actually be used for working code optimization or, alternatively, as a
teaching tool for students. This would involve some work on the interface, additional
heuristic programs, and improvements to the timer, such as recognizing when loops are
vectorizable and permitting different guesses for the default number of loop iterations
for different loops.

Finally, optimization for architectures other than traditional von Neumann
architectures is a rapidly growing field. There are quite a few special architectures
available for image processing [11]. Many of the optimizations applied here are also
applicable to intermediate code for multiprocessor machines such as the Connection
Machine [15]. Copy propagation, loop interchange, loop joining, and loop splitting are
all optimization techniques, discussed by Padua and Wolfe [21], that are applicable
to vector or concurrent computers.